A two-level sequential decision formulation for the control of
interconnected stochastic linear discrete-time systems is investigated.
An interconnection of several subsystems is considered, in which each
subsystem has a decision maker and an associated quadratic cost
function. One of the decision makers is designated as the leader, or
coordinator, and his control strategy is chosen prior to those
of the others. The information available to each decision maker may
differ from that available to the others. The second-level decision makers
are regarded as followers in the context of Stackelberg strategies.
Their strategies follow the Nash equilibrium concept or the
Pareto optimal concept, except that the coordinator's strategy is known
to all of them. The coordinator chooses his strategy under the
assumption that the followers will fully exploit the prior announcement
of his strategy. Both centralized and decentralized information structures are
considered. Dynamic programming is employed to derive the recursive
equations that determine the control law for each subsystem.
The decentralized information structure is more attractive, since each
subsystem control law is based only on local measurements. However, a
two-point boundary value problem then has to be solved. A simple algorithm
is suggested, but conditions for its convergence are not yet available.
Finally, a decentralized Stackelberg strategy for an interconnected
power system is suggested. The design procedure emphasizes proportional
plus integral control in the context of Stackelberg strategies.

1. INTRODUCTION

1.1 Introduction

A multi-level  structure  for  a large  scale  system  appears  rather 
naturally  in  practice.  It  is  the  consequence  of  an  effort  toward 
efficient  utilization  of  the  available  resources  or  the  inherent 
limitations  of  the  elements  out  of  which  the  system  is  built.  An 
interconnected  power  system  provides  an  important  example  of  a class  of 
large-scale  systems. 

A significant development in large-scale system theory is the
concept of multi-person stochastic games with nonclassical information
patterns and their implications for decentralized and hierarchical
control strategies [1,4,12-15,42,43,45,60,62]. A
theory of coordination using the bargaining approach [15] is an
important and interesting avenue for new research.

The main object of this thesis is to investigate Stackelberg
coordination for decentralized stochastic control. A strong motivation
for this study is its potential application to decentralized control
problems such as those found in an interconnected power system, which can
be described as a collection of subsystems, each of which is called a
control area. Each area is responsible for meeting its obligation to
maintain the appropriate system frequency and to supply its own load
demand. Each area also provides mutual assistance to its neighbours in
accordance with the basic operating policy of interconnected power
systems [23]. When the interconnected network is small, centralized
techniques can be used quite effectively [19,20,23,25,33,39,57].
In the more general case, however, the communication and computational costs
involved in implementing a centralized controller often become
prohibitive, and decentralization of some sort becomes essential.


We will investigate both the theoretical framework and a potential
practical application of Stackelberg coordination for decentralized
stochastic control of general organizational forms of large-scale
systems. These systems may be controlled by multiple decision makers
having different models, different information sets and different
objective functionals. Our approach will be based on differential games
[16-18,47-53], stochastic control [2,3,5,30,37,38,42,54,63,64] and
electric power system control [19,20,23,25,33,39,57].


1.2 Literature Survey

The  design  of  large,  complex  systems  invariably  involves 
decomposition  of  the  system  into  a number  of  smaller  subsystems  each 
with  its  own  objective  functions  and  constraints  [40].  The  resulting 
interconnection  of  subsystems  may  take  on  many  forms,  but  one  of  the 
most  common  is  the  hierarchical  form  in  which  a given  level  subsystem 
controls  or  coordinates  the  subsystems  on  the  level  below  it  and  in  turn 
is  controlled  by  the  subsystems  on  the  level  above  it.  The  information 
available  to  a subsystem  on  a given  level  and  the  way  such  a subsystem 
can  make  use  of  the  information  to  influence  or  control  another 
subsystem has been the object of much study.

Decentralized information among decision makers was first studied
in the static team theory of Radner [44]. For the dynamic case, H.S.
Witsenhausen [61-63] was the first to show that the linear quadratic
Gaussian (LQG) problem is nontrivial when the information pattern is
nonclassical. Chong and Athans [10] imposed constraints on the control
structure of an LQG system whose controllers have different information sets. They
showed that the parameter matrices of each dynamic controller could then
be globally optimized by solving a deterministic matrix optimal control
problem. Ho and Chu [13,27] demonstrated that certain nonclassical
stochastic control problems admit a linear solution. Sandell and Athans
[45] showed that LQG problems with a unit time delay in information
exchange admit a linear optimal decision rule, which can be calculated
explicitly. These results appear promising as far as their
applicability to decentralized control theory is concerned. With
decentralized information, there is a trade-off between information
efficiency and computational efficiency. Chong and Athans [14] assumed
that the "coordinator" was allowed to "interfere" only once in a while.
When the coordinator acts open-loop, the lower-level problems can be
decomposed completely.

Although different information sets are available to each
controller, there is cooperation among the controllers because
they all try to minimize the same cost functional in the framework of
team theory. This type of situation can be described as the
"cooperative and partially decentralized case" in large-scale system
theory.

It appears that a theory of coordination using the bargaining
approach could also be developed within the same framework. It certainly
represents an important and interesting avenue for new research. This
had not been attempted until very recently, when Cruz [15] proposed the
extension of Stackelberg strategies to the coordination of several
subsystems.

One would naturally expect game theory to be of considerable use
in bargaining. In fact, game theory has already been used to study
bargaining-type situations between organizations in an economy or a
society [58,59]. The idea of using control theory to solve games with
dynamic evolution was initiated by Isaacs [29]. The games Isaacs
studied were primarily deterministic zero-sum games. Later, a more
general concept, the theory of N-player
differential games, was introduced. Starr and Ho [47,48] considered
nonzero-sum differential games with solution concepts or rationales
such as Nash, Pareto and minimax in a dynamic sense. The concepts of
closed-loop and open-loop solutions were adapted from modern control
theory to dynamic game theory, and relate to the class of admissible
strategies. In particular, interest has focused on the
determination of Nash equilibrium strategies for deterministic
linear-quadratic nonzero-sum differential games with dynamic information
structures [45,46]. Most of the equilibrium solutions found in the
literature for such games have been linear in the information available
to each player. Only recently, T. Basar [8] has shown, via a
counterexample, that when at least one of the players has access to
closed-loop information, such games admit non-unique and nonlinear
equilibrium solutions. Recently, Cruz et al. [16,49-53] have
extended the Stackelberg strategy, developed for static games [58], to
dynamic games. The feedback Stackelberg solution concept [17] has been
extended to a class of stochastic games by Castanon and Athans [18].

The  theory  of  stochastic  dynamic  games  is  based  on  the  works  of 
Witsenhausen  [61-64],  but  earlier,  Rhodes  and  Luenberger  [42,43],  and 
Behn  and  Ho  [4]  considered  the  problem  of  zero-sum  dynamic  games  with 
imperfect  information.  The  restriction  of  the  transfer  of  information 
through  decision  was  discussed  by  Aoki  [1]  while  considering  equilibria 
in  Nash  games. 

Interconnected electric energy systems provide an important example
of a class of large-scale systems. In several papers
[19,20,23,25,33,39,57], attempts have been made to analyze the load
frequency controller of an interconnected power system via modern
optimal control theory. Since the solution proposed by Elgerd [25] is
based on the standard linear regulator theory for disturbance-free
dynamic systems, it neither eliminates the steady-state errors of
frequency and tie-line flows caused by system load disturbances, nor
provides the desired generation distribution. However, a reasonable
dynamic model was given. A new design procedure for load and frequency
control was developed later by Calovic [19,20], which avoids all the
shortcomings of previous solutions. The procedure is to adjoin
the integral of each area control error (ACE) to the system state
variables. These new state variables, as well as the original system
state variables, are included in the cost functional. As a result, all
areas capable of doing so will drive their area control errors to zero
in steady state, provided the system is stable. Recently, Kwatny [33]
suggested that when energy source response limitations are recognized,
the load frequency control (LFC) problem should be viewed as a
"tracking" problem rather than a "regulator" problem. The estimation
and prediction of load are used to coordinate generation in each area so
as to regulate power flows and frequencies.

1.3 Problem Area and Methodology

The coordination of a large-scale system using the differential games
approach represents an important and interesting area for research. Such a
system has the following characteristics [15]: 1. two or more decision
makers having different models, 2. different information sets available
to the decision makers, and 3. different objective functionals. We
will investigate, in detail, Stackelberg strategies for multilevel
systems. The leader, who acts as a coordinator, and the other decision makers,
who are viewed as followers, assume different models of the same system.
Several classes of information structures available to the decision
makers will be discussed.

First, we consider an interconnection of M discrete-time linear
stochastic subsystems and associate with each subsystem a
decision maker, a quadratic performance criterion, and a linear noisy
measurement. Superimposed on the interconnection is an additional
decision maker, called the coordinator, acting through an additional
discrete-time linear stochastic subsystem, with a separate quadratic
performance criterion and a separate linear noisy measurement. The
coordinator is viewed as a leader and the other decision makers as
followers who adopt the Nash rationale or the Pareto rationale among themselves.
The Stackelberg equilibrium strategy [17] is extended to fit this
situation, in which there is one leader and many followers.

When all decision makers have perfect system measurements, or when
the information sets of all the followers are identical and the
coordinator's information contains the followers' information, a feedback
control structure will be sought based on the stochastic Stackelberg
equilibrium strategy [18]. The following special cases will also be
examined: 1. when the coordinator has perfect measurements and all the
followers have identical noisy measurements, and 2. when the
coordinator has noisy measurements and the followers have no
measurements.

These classes of information structure are not very realistic, but they
provide some insight into the more complex and realistic cases treated
subsequently. Satisfactory control of a high-order system may often be
achieved using relatively few measurements and a controller of
relatively low order. This has been the motivation for a number of
design procedures using output feedback or dynamic controllers of a
specified order [31,32,38,54,63,64]. Although the assumption of
linearity in the class of instantaneous feedback control laws might lead
to results far from optimal, as pointed out by Witsenhausen [61]
and Basar [8], the practical need for simplifying approximations becomes
more acute in decentralized control, where there are many separate
controllers. Decentralized Stackelberg strategies that are constrained
to be linear dynamic controllers of specified orders will be
determined. This control policy has the obvious advantage of being
structurally simpler to implement, since it does not require memory of
past measurements. However, there exist, at present, no stability
results for this algorithm.

Finally, decentralized Stackelberg strategies will be used to
develop a decentralized controller for a three-area electric power
system. This design procedure meets all the performance requirements of
load and frequency control, i.e. a control law independent of the
disturbance, zero steady-state offsets of frequency and tie-line exchange
variations, and optimal transient performance. The dynamic model
developed by Elgerd and Fosha [25] and Calovic [19] will be used. To
obtain zero steady-state offsets of frequency and
tie-line exchange variations, the integral of each area control error
(ACE) is adjoined to the system equations. These new state variables
are included in the cost functional. As far as decentralized Stackelberg
strategies are concerned, each control area is constrained to feed back only its own
measurements and has its own choice of cost functional. The area
that has superiority in computing its strategy or collecting information
will be declared the coordinator, which coordinates the other areas,
which are viewed as followers. When the lower-level subsystems desire to
cooperate among themselves, a Pareto optimal solution will be chosen;
otherwise a Nash equilibrium solution will be chosen. The algorithm for
obtaining decentralized controllers is developed and applied to
load-frequency control of interconnected power systems. The
computational algorithm suggested cannot guarantee satisfactory
results. In practice, however, the algorithm has exhibited rapid
convergence.

1.4 Organization of the Work

In Chapter 2, three important strategies in game theory, i.e.
Nash equilibrium, Pareto optimal and Stackelberg equilibrium, are
discussed. The necessary conditions for the three strategies applied to
a linear quadratic Gaussian discrete game are reviewed.

Chapters 3 and 4 deal with Stackelberg coordination. Centralized
and decentralized information structures are studied in this context.
The decentralized structure is more attractive since the control sequences
are functions of the measurable outputs only. The general approach is to
designate one subsystem as the coordinator, or leader, which coordinates
the rest of the subsystems, which are viewed as followers. Among the
followers a Pareto optimal or Nash equilibrium solution is selected
according to whether or not they decide to cooperate. These concepts, along
with the solutions, are derived.

In Chapter 5, the algorithm for solving the decentralized stochastic
Stackelberg coordination problem suggested in Section 3.4 is investigated
further. A three-area interconnected power system, which is an example of a
large-scale system, is selected as our example. The design procedure
emphasizes proportional-plus-integral feedback control. A simulation
study is presented.


2.  LINEAR  QUADRATIC  DIFFERENTIAL  GAMES 




In this chapter, some important aspects of nonzero-sum games that
are pertinent to this work are reviewed. We will consider a special
class of differential games in which the system is linear and the cost
functions are quadratic functions of the state vectors and controls.
This is probably the only non-trivial class of differential games in
which solutions based on any rationale can be obtained analytically
without difficulty.


In differential games, one must choose a solution concept, such as
Nash equilibrium, noninferiority or Stackelberg equilibrium. One
must also specify what information is available to each player during
the course of the game. Extensive work has been done on deterministic
nonzero-sum differential games, with particular emphasis on
two-person games of linear quadratic form [35,36,47-53]. Results
available in the literature indicate that the solutions of interest,
Nash equilibrium, Pareto optimal and Stackelberg equilibrium, for
this class of games, and for different a priori fixed strategy spaces,
are affine policies for each player, provided that certain existence
conditions are satisfied. T. Basar [8] has given a counterexample
showing that a two-person nonzero-sum game can admit a nonlinear Nash
solution. He has also shown that it is possible to obtain a robust
solution that is globally unique by including an additive zero-mean
white noise in the state dynamics. To present the ideas without loss of
conceptual generality, a two-person stochastic nonzero-sum game with
perfect measurements is considered. Three types of strategies are
reviewed: the Nash equilibrium strategy, the Stackelberg equilibrium
strategy and the Pareto optimal strategy.

2.2  Problem  Formulation 

A general formulation of the two-person discrete-time linear
quadratic stochastic differential game is given as follows:

$$x(k+1) = A x(k) + B u(k) + C v(k) + w(k) \tag{2.1}$$

$$y^1(k) = H^1 x(k) + \omega^1(k) \tag{2.2}$$

$$y^2(k) = H^2 x(k) + \omega^2(k) \tag{2.3}$$

where $x(k)$ is the $n$-dimensional state vector, $u(k)$ is the $m$-dimensional
control vector of player 1, $v(k)$ is the $l$-dimensional control vector of
player 2, and $y^i(k)$ is the $p^i$-dimensional measured output vector of the
$i$-th player. The vectors $w(k)$, $\omega^i(k)$ and $x(0)$ are independent Gaussian
random vectors for all $k$, with $x(0) \sim N(0, X(0))$, $w(k) \sim N(0, \Phi(k))$ and
$\omega^i(k) \sim N(0, \Omega^i(k))$. Each player $i$ chooses a control vector from a set
of admissible controls $U^i$ to minimize the expected value of the cost function
$J^i$, where

$$J^i(x,u,v,k) = x^T(N) Q^i(N) x(N) + \sum_{k=0}^{N-1} \left[ x^T(k) Q^i(k) x(k) + u^T(k) R^i(k) u(k) + v^T(k) S^i(k) v(k) \right], \quad i = 1,2 \tag{2.4}$$

Because there is more than one cost functional in differential
games, optimality is defined in terms of the rationality assumed by the
players in computing their controls. The most commonly known rationales
are the Nash, Pareto and Stackelberg solutions, which are reviewed in the
following sections. These are discussed in detail in [35,36,47,53].

At each stage of the game, each player has access to some
information $I^i$ about the present and/or past values of the state vector,
its own cost function as well as those of the other players, and the control
strategies of the other players. Each player $i$ has a control strategy
that is a mapping from the information set $I^i$ to the control space $U^i$.

2.3  Nash  Equilibrium  Strategy 

The Nash equilibrium strategy, which is secure against unilateral
deviations by any one player, depends on what information is available
to the players during the course of play: for example, the
'closed-loop' and 'open-loop' assumptions lead to entirely different
costs and controls. It is important to note that all the cost
function mappings are included in each information set $I^i$. Furthermore, all
players' decisions are announced simultaneously. The Nash equilibrium
strategy is reasonable when cooperation or coalition cannot be
guaranteed and the information structure is as stated above.

In this section, we review the necessary conditions for obtaining
Nash equilibrium strategies for the discrete-time dynamic game (2.1) with
perfect information, i.e. $y^i(k) = x(k)$, via dynamic programming.

At stage $k$:

$$u^*(k) = \arg\min_{u(k)} E\{ x^T(k) Q^1(k) x(k) + u^T(k) R^1(k) u(k) + v^{*T}(k) S^1(k) v^*(k) + J^{1*}(k+1) \mid I^1(k) \} \tag{2.5}$$

$$v^*(k) = \arg\min_{v(k)} E\{ x^T(k) Q^2(k) x(k) + u^{*T}(k) R^2(k) u^*(k) + v^T(k) S^2(k) v(k) + J^{2*}(k+1) \mid I^2(k) \} \tag{2.6}$$

When $u^*(k)$ and $v^*(k)$ satisfy (2.5) and (2.6) simultaneously, the pair
$(u^*(k), v^*(k))$ constitutes a Nash equilibrium solution. The Nash
optimal strategies for (2.1) are:

$$u^*(k) = -\Delta^1(k) x(k) \tag{2.7}$$

$$v^*(k) = -\Delta^2(k) x(k) \tag{2.8}$$

where

$$\Delta^1(k) = [R^1 + K^1 B]^{-1} K^1 A$$

$$\Delta^2(k) = [S^2 + K^2 C]^{-1} K^2 A$$

$$K^1 = B^T P^1(k+1) \left[ I - C (S^2 + C^T P^2(k+1) C)^{-1} C^T P^2(k+1) \right] \tag{2.9}$$

$$K^2 = C^T P^2(k+1) \left[ I - B (R^1 + B^T P^1(k+1) B)^{-1} B^T P^1(k+1) \right] \tag{2.10}$$

The optimal costs-to-go are

$$J^{1*}(k) = x^T(k) P^1(k) x(k) + \pi^1(k) \tag{2.11}$$

$$J^{2*}(k) = x^T(k) P^2(k) x(k) + \pi^2(k) \tag{2.12}$$

where

$$P^1(k) = Q^1 + \Delta^{1T} R^1 \Delta^1 + \Delta^{2T} S^1 \Delta^2 + (A - B\Delta^1 - C\Delta^2)^T P^1(k+1) (A - B\Delta^1 - C\Delta^2); \quad P^1(N) = Q^1(N) \tag{2.13}$$

$$\pi^1(k) = \pi^1(k+1) + \operatorname{tr}\{\Phi(k) P^1(k+1)\}; \quad \pi^1(N) = 0 \tag{2.14}$$

$$P^2(k) = Q^2 + \Delta^{1T} R^2 \Delta^1 + \Delta^{2T} S^2 \Delta^2 + (A - B\Delta^1 - C\Delta^2)^T P^2(k+1) (A - B\Delta^1 - C\Delta^2); \quad P^2(N) = Q^2(N) \tag{2.15}$$

$$\pi^2(k) = \pi^2(k+1) + \operatorname{tr}\{\Phi(k) P^2(k+1)\}; \quad \pi^2(N) = 0 \tag{2.16}$$

These equations can be solved backwards in time using the given final
conditions. A sufficient condition for the existence of the solution,
given by T. Basar [10], is that $[R^1 + K^1 B]$ and $[S^2 + K^2 C]$ are nonsingular.
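
As a hedged illustration of how the recursion (2.7)-(2.16) runs backwards in time, the sketch below specializes it to a scalar game (n = m = l = 1) with time-invariant parameters. The function name `nash_riccati_scalar`, the names `d1` and `d2` for the feedback gains in (2.7)-(2.8), and the numerical values in the usage note are assumptions introduced here for illustration; they do not come from the original text.

```python
def nash_riccati_scalar(a, b, c, q1, q2, r1, r2, s1, s2, phi, N):
    """Scalar sketch of the Nash recursion (2.7)-(2.16).

    a, b, c  : scalar system parameters of (2.1)
    q*, r*, s*: scalar cost weights of players 1 and 2 (also used as Q^i(N))
    phi      : variance of the process noise w(k)
    N        : number of stages
    Returns the gains [(d1(0), d2(0)), ...], the pair (P^1(0), P^2(0))
    and the pair (pi^1(0), pi^2(0)).
    """
    p1, p2 = q1, q2            # final conditions P^i(N) = Q^i(N)
    pi1 = pi2 = 0.0            # final conditions pi^i(N) = 0
    gains = []
    for _ in range(N):         # stages k = N-1, ..., 0
        k1 = b * p1 * (1.0 - c * c * p2 / (s2 + c * c * p2))   # (2.9)
        k2 = c * p2 * (1.0 - b * b * p1 / (r1 + b * b * p1))   # (2.10)
        if r1 + k1 * b == 0.0 or s2 + k2 * c == 0.0:
            raise ValueError("existence condition violated (singular term)")
        d1 = k1 * a / (r1 + k1 * b)     # gain of player 1, u* = -d1 x
        d2 = k2 * a / (s2 + k2 * c)     # gain of player 2, v* = -d2 x
        acl = a - b * d1 - c * d2       # closed-loop dynamics
        pi1 += phi * p1                 # (2.14), uses P^1(k+1)
        pi2 += phi * p2                 # (2.16), uses P^2(k+1)
        p1, p2 = (q1 + d1 * d1 * r1 + d2 * d2 * s1 + acl * acl * p1,   # (2.13)
                  q2 + d1 * d1 * r2 + d2 * d2 * s2 + acl * acl * p2)   # (2.15)
        gains.append((d1, d2))
    gains.reverse()            # gains[k] now belongs to stage k
    return gains, (p1, p2), (pi1, pi2)
```

For example, `nash_riccati_scalar(1.0, 1.0, 0.5, 1.0, 2.0, 1.0, 0.0, 0.0, 1.0, 0.1, 1)` performs a single backward step; the resulting pair of gains satisfies both players' stationarity conditions at the same time, which is the defining property of the Nash pair.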


2.4 Pareto Optimal Strategy


If  it  is  possible  for  all  players  in  a differential  game  to  agree, 
prior  to  the  starting  time,  to  coordinate  their  strategies,  then  the 
resulting  set  of  control  should  be  chosen  from  the  Pareto  set  of 
solutions.  No  other  feasible  choice  of  controls  could  decrease  the 
costs  incurred  by  one  or  more  players  without  increasing  the  costs 
incurred  by  the  others.  The  selection  of  a particular  solution  in  the 
Pareto  set  is  generally  made  subjectively  based  upon  negotiation  among 
the  players.  Finding  the  Pareto  set  for  a differential  game  is 
equivalent  to  solving  an  optimal  control  problem  with  a vector  cost 
function. When appropriate convexity conditions are satisfied [47,48],
the problem is equivalent to solving an N-1 parameter family of optimal
control problems with scalar cost criteria

$$J = \sum_{i=1}^{2} \alpha_i J^i = \sum_{i=1}^{2} \alpha_i \left[ x^T(N) Q^i(N) x(N) + \sum_{k=0}^{N-1} \left( x^T(k) Q^i(k) x(k) + u^T(k) R^i(k) u(k) + v^T(k) S^i(k) v(k) \right) \right] \tag{2.17}$$

The components of $\alpha$ are interpreted as the relative weights placed
on the interests of the players entering the agreement. For any given
weighting vector $\alpha$, the Pareto optimal solution is found by solving a
linear quadratic optimal control problem. The controls corresponding to
this solution are:

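$$U^*(k) = -[R + D^T P(k+1) D]^{-1} D^T P(k+1) A x(k) \tag{2.18}$$

where

$$D = [B \;\; C]; \quad U = \begin{bmatrix} u \\ v \end{bmatrix}; \quad Q = \sum_{i=1}^{2} \alpha_i Q^i; \quad R = \begin{bmatrix} \sum_{i=1}^{2} \alpha_i R^i & 0 \\ 0 & \sum_{i=1}^{2} \alpha_i S^i \end{bmatrix}$$

$$P(k) = Q + K^T R K + (A - DK)^T P(k+1) (A - DK); \quad P(N) = Q(N) \tag{2.19}$$

Here $K$ denotes the feedback gain in (2.18), i.e. $U^*(k) = -K x(k)$.
A sufficient condition for the solution to exist is that the matrix
to be inverted is positive definite. These equations can be solved
backwards in time using the given final conditions.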
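
The weighted-sum construction (2.17)-(2.19) can be sketched for a scalar state with two scalar controls as follows. This is a minimal illustration under stated assumptions: the function name `pareto_gains_scalar`, the tuple layout of the arguments and the numbers in the usage note are hypothetical, the parameters are taken to be time-invariant, and the 2x2 matrix $R + D^T P(k+1) D$ is inverted in closed form.

```python
def pareto_gains_scalar(alpha, a, b, c, q, r, s, q_final, N):
    """Scalar-state sketch of the Pareto recursion (2.18)-(2.19).

    alpha            : weights (alpha_1, alpha_2) of the two players
    a, b, c          : scalar system parameters of (2.1)
    q, r, s, q_final : pairs (player 1, player 2) of cost weights
    Returns the gains [(ku(0), kv(0)), ...] with u = -ku x, v = -kv x,
    and the weighted cost matrix P(0).
    """
    wq = alpha[0] * q[0] + alpha[1] * q[1]               # Q  = sum alpha_i Q^i
    ru = alpha[0] * r[0] + alpha[1] * r[1]               # upper-left block of R
    rv = alpha[0] * s[0] + alpha[1] * s[1]               # lower-right block of R
    p = alpha[0] * q_final[0] + alpha[1] * q_final[1]    # P(N) = Q(N)
    gains = []
    for _ in range(N):                                   # stages k = N-1, ..., 0
        # M = R + D^T P(k+1) D with D = [b c]
        m11, m12, m22 = ru + b * b * p, b * c * p, rv + c * c * p
        det = m11 * m22 - m12 * m12
        if m11 <= 0.0 or det <= 0.0:
            raise ValueError("matrix to be inverted is not positive definite")
        # K = M^{-1} D^T P(k+1) a, written out for the 2x2 case
        ku = (m22 * b - m12 * c) * p * a / det
        kv = (m11 * c - m12 * b) * p * a / det
        acl = a - b * ku - c * kv
        p = wq + ku * ku * ru + kv * kv * rv + acl * acl * p   # (2.19)
        gains.append((ku, kv))
    gains.reverse()
    return gains, p
```

Calling the function for several weight vectors, for instance `alpha = (0.1, 0.9)`, `(0.5, 0.5)` and `(0.9, 0.1)`, traces out different points of the Pareto set, one optimal control problem per choice of weights.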


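The cost-to-go incurred when the players use arbitrary linear
feedback controls of the form

$$u^i(k) = K^i x(k), \quad i = 1,2 \tag{2.20}$$

with $u^1 = u$ and $u^2 = v$, is

$$J^i(k) = x^T(k) P^i(k) x(k) + \pi^i(k) \tag{2.21}$$

where

$$P^i(k) = Q^i + K^{1T} R^i K^1 + K^{2T} S^i K^2 + (A + BK^1 + CK^2)^T P^i(k+1) (A + BK^1 + CK^2); \quad P^i(N) = Q^i(N) \tag{2.22}$$

$$\pi^i(k) = \pi^i(k+1) + \operatorname{tr}[P^i(k+1) \Phi(k)]; \quad \pi^i(N) = 0 \tag{2.23}$$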
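
The evaluation in (2.21)-(2.23) can be used to compare any two candidate feedback pairs, for example a Nash pair against a Pareto pair. The following is a small scalar sketch under the assumption of time-invariant parameters; the function names `feedback_cost_to_go` and `expected_cost` and all numerical values are illustrative and not taken from the original text.

```python
def feedback_cost_to_go(k1, k2, a, b, c, q, r, s, q_final, phi, N):
    """Scalar sketch of (2.22)-(2.23) for one player's cost.

    k1, k2     : feedback gains, u = k1 * x and v = k2 * x as in (2.20)
    q, r, s    : that player's weights on x, u and v
    q_final    : that player's terminal weight Q^i(N)
    phi        : variance of the process noise w(k)
    Returns (P^i(0), pi^i(0)).
    """
    acl = a + b * k1 + c * k2          # closed-loop dynamics under (2.20)
    p, pi = q_final, 0.0               # P^i(N) = Q^i(N), pi^i(N) = 0
    for _ in range(N):                 # stages k = N-1, ..., 0
        pi = pi + p * phi              # (2.23), uses P^i(k+1)
        p = q + k1 * k1 * r + k2 * k2 * s + acl * acl * p   # (2.22)
    return p, pi


def expected_cost(p0, pi0, x0_variance):
    """Expected cost E{J^i(0)} for a zero-mean initial state, from (2.21)."""
    return p0 * x0_variance + pi0
```

For instance, `feedback_cost_to_go(-0.5, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 1)` evaluates a single stage in which only the first player applies feedback.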

2.5 Stackelberg Equilibrium Strategy

In this section we consider two-person games in which one player is
called the leader and the other is called the follower. In the
Stackelberg solution concept there is a difference in information
between the two players. The leader, who acts first, knows the cost
function mapping of the follower, but the follower may or may not know
the cost function mapping of the leader. However, the follower, who
acts second, knows the value of the leader's decision and takes
this into account in computing his strategy. Within the dynamic game
context, three types of solution concepts are important in Stackelberg
games: open-loop, closed-loop and equilibrium solutions. In this
thesis we consider only the Stackelberg equilibrium strategy, which
satisfies the principle of optimality. For the discrete-time system (2.1)
with perfect information, u(k) represents the decision of the leader and
v(k) the decision of the follower. Using dynamic programming at stage
k,

$$v^0(u,k) = \arg\min_{v(k)} E\{ x^T(k) Q^2(k) x(k) + u^T(k) R^2(k) u(k) + v^T(k) S^2(k) v(k) + J^{2*}(k+1) \mid I^2(k) \} \tag{2.24}$$

$$u^*(k) = \arg\min_{u(k)} E\{ x^T(k) Q^1(k) x(k) + u^T(k) R^1(k) u(k) + v^{0T}(k) S^1(k) v^0(k) + J^{1*}(k+1) \mid I^1(k) \} \tag{2.25}$$

where $v^0(u,k)$ is the follower's optimal reaction to a decision $u(k)$ by the
leader. The optimal strategies are:

$$u^*(k) = -W^{-1}(k) Y(k) x(k) \tag{2.26}$$

$$v^0(u,k) = -\Delta(k) [A x(k) + B u(k)] \tag{2.27}$$

where

$$W(k) = R^1 + B^T \Delta^T S^1 \Delta B + B^T (I - C\Delta)^T K^1(k+1) (I - C\Delta) B$$

$$Y(k) = B^T \Delta^T S^1 \Delta A + B^T (I - C\Delta)^T K^1(k+1) (I - C\Delta) A$$

$$\Delta(k) = [S^2 + C^T K^2(k+1) C]^{-1} C^T K^2(k+1)$$

$$L(k) = Q^1 + A^T \Delta^T S^1 \Delta A + A^T (I - C\Delta)^T K^1(k+1) (I - C\Delta) A$$

The optimal costs-to-go are

$$J^{1*}(k) = x^T(k) K^1(k) x(k) + \pi^1(k) \tag{2.28}$$

$$J^{2*}(k) = x^T(k) K^2(k) x(k) + \pi^2(k) \tag{2.29}$$

$$K^1(k) = L(k) - Y^T(k) W^{-1}(k) Y(k); \quad K^1(N) = Q^1(N) \tag{2.30}$$

$$\pi^1(k) = \pi^1(k+1) + \operatorname{tr}[\Phi(k) K^1(k+1)]; \quad \pi^1(N) = 0 \tag{2.31}$$

$$K^2(k) = Q^2 + (A - BW^{-1}Y)^T K^2(k+1) (I - C\Delta) (A - BW^{-1}Y) + Y^T W^{-1} R^2 W^{-1} Y; \quad K^2(N) = Q^2(N) \tag{2.32}$$

$$\pi^2(k) = \pi^2(k+1) + \operatorname{tr}[\Phi(k) K^2(k+1)]; \quad \pi^2(N) = 0 \tag{2.33}$$
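
To make the order of the computations in the feedback Stackelberg recursion concrete, the sketch below carries it out for a scalar game with time-invariant parameters. It is an illustration under stated assumptions rather than the thesis's own algorithm: the function name `stackelberg_feedback_scalar`, the helper names `m`, `ell`, `g` and `abar`, and the test values are introduced here, and the noise terms are omitted because they shift the costs-to-go without changing the gains.

```python
def stackelberg_feedback_scalar(a, b, c, q1, q2, r1, r2, s1, s2, N):
    """Scalar sketch of the feedback Stackelberg recursion.

    Player 1 is the leader (control u), player 2 the follower (control v).
    Returns the policies [(g(0), delta(0)), ...] with
        u*(k) = -g(k) x(k),   v0(u,k) = -delta(k) (a x(k) + b u(k)),
    and the pair (K^1(0), K^2(0)).
    """
    k1, k2 = q1, q2                          # K^1(N) = Q^1(N), K^2(N) = Q^2(N)
    policies = []
    for _ in range(N):                       # stages k = N-1, ..., 0
        # follower's reaction gain, computed from K^2(k+1)
        delta = c * k2 / (s2 + c * c * k2)
        # common factor: delta^T S^1 delta + (I - C delta)^T K^1(k+1) (I - C delta)
        m = delta * delta * s1 + (1.0 - c * delta) ** 2 * k1
        w = r1 + b * b * m                   # W(k)
        y = a * b * m                        # Y(k)
        ell = q1 + a * a * m                 # L(k)
        if w <= 0.0:
            raise ValueError("W(k) must be positive for the leader's minimum")
        g = y / w                            # leader gain W^{-1} Y
        abar = a - b * g                     # A - B W^{-1} Y
        k1, k2 = (ell - y * y / w,           # leader cost matrix K^1(k)
                  q2 + abar * abar * k2 * (1.0 - c * delta) + g * g * r2)  # K^2(k)
        policies.append((g, delta))
    policies.reverse()
    return policies, (k1, k2)
```

Note the asymmetry with the Nash case: the follower's gain is computed first, as a reaction to an arbitrary leader decision, and the leader then minimizes his own cost with that reaction substituted in.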

3. STACKELBERG COORDINATION
WITH NASH RATIONALE AMONG LOWER-LEVEL SUBSYSTEMS


In this chapter we investigate a sequential decision approach to
the control of an interconnection of several subsystems. Associated
with each subsystem is a decision maker and a performance criterion,
or cost function. A framework for studying strategies for the
control of such systems is that of nonzero-sum M-person differential games
[35,36,47,48]. Various solution concepts for defining optimality have
been proposed and examined. One of the most widely studied solution
concepts is the Cournot or Nash strategy [47,53], whereby the
decision makers simultaneously minimize their respective cost functions
with respect to their individual controls. At equilibrium, when all the
decision makers apply their Nash strategies, the cost function of any
subsystem is at a minimum with respect to the control for that subsystem.

A sequential decision solution concept was first studied by
Stackelberg [58] in the context of a static economic problem with two
decision makers. In [16,51,52] the Stackelberg concept was developed
for two-person dynamic games with perfect information. Three types of
Stackelberg strategies were investigated in [16,51,52]: open-loop,
closed-loop, and feedback. In general, the open-loop and closed-loop
Stackelberg strategies do not satisfy the principle of optimality, but
the feedback strategy and the more general equilibrium strategy [17] are
defined to satisfy the principle of optimality. Open-loop Stackelberg
strategies were considered in [53] for two groups of players, where the
players in each group use Nash strategies with respect to each other but
each group plays according to the open-loop Stackelberg concept with
respect to the other group. All these strategies are for deterministic
dynamic games. In [18] the feedback Stackelberg solution concept is
extended to stochastic two-person dynamic games.


The approach developed explicitly in this chapter is based on
the coordination solution concept suggested in [15] for deterministic
systems. We allow stochastic disturbances in the dynamic process model
and in the measurement model, as in [13], but several second-level
decision makers, or followers, are present, as in [15]. Several types of
information structure are considered. Explicit recursion formulas for
the design of the feedback Stackelberg controllers for the coordinator
and the followers are presented. The strategies are adaptive to changes
in the information available at each stage and they satisfy the principle of
optimality. The strategies of the second-level decision makers are
equilibrium Nash strategies with respect to each other and, in addition,
they take into account the known strategy of the coordinator. The
coordinator chooses his strategy in full anticipation that the
other decision makers will take the coordinator's strategy into account in
minimizing their individual cost functions.

3.2 Problem Formulation


Consider $M$ discrete-time linear subsystems, each modeled by

$$x^i(k+1) = A^{i0}(k)x^0(k) + A^{ii}(k)x^i(k) + \sum_{j=1,\ j\neq i}^{M} A^{ij}(k)x^j(k) + B^i(k)u^i(k) + \theta^i(k) \tag{3.1}$$

The measurement of each subsystem is given by

$$z^i(k) = H^{i0}(k)x^0(k) + H^{ii}(k)x^i(k) + \sum_{j=1,\ j\neq i}^{M} H^{ij}(k)x^j(k) + \xi^i(k), \quad i=1,\dots,M \tag{3.2}$$


where $x^i$ is the $n_i$-dimensional state vector of the $i$-th subsystem, $u^i$ is
the $m_i$-dimensional local control vector of the decision maker $DM^i$ for
the $i$-th subsystem, and $z^i$ is the $l_i$-dimensional measured output vector for
the $i$-th subsystem. The vectors $x^i(0)$; $\{\theta^i(k) \in R^{n_i}\}$; $\{\xi^i(k) \in R^{l_i}\}$; $i=1,\dots,M$;
are mutually independent Gaussian random vectors for all $k$ with known
means and covariances:

$$E\{x^i(0)\} = 0; \quad \mathrm{Cov}\{x^i(0)\} = \Sigma^i(0)$$

$$E\{\theta^i(k)\} = 0; \quad \mathrm{Cov}\{\theta^i(k)\} = \Theta^i(k)$$

$$E\{\xi^i(k)\} = 0; \quad \mathrm{Cov}\{\xi^i(k)\} = \Xi^i(k)$$

Each subsystem seeks to minimize the expected value of its cost function

$$J^i(u^i) = \tfrac{1}{2}\,x^{iT}(N)Q^{ii}(N)x^i(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[x^{iT}(k)Q^{ii}(k)x^i(k) + u^{iT}(k)R^{ii}(k)u^i(k)\right], \quad i=1,\dots,M \tag{3.3}$$

In addition to the $M$ subsystems, we assume that we have a
coordinator subsystem modeled by
$$x^0(k+1) = A^0(k)x^0(k) + \sum_{i=1}^{M} A^{0i}(k)x^i(k) + \theta^0(k) \tag{3.4}$$

and the measurement of the coordinator subsystem is given by

$$z^0(k) = H^0(k)x^0(k) + \sum_{i=1}^{M} H^{0i}(k)x^i(k) + \xi^0(k) \tag{3.5}$$

where $x^0$ is the $n_0$-dimensional state vector of the coordinator
subsystem, $u^0$ is an $m_0$-dimensional control vector chosen by the
coordinator $DM^0$, and $z^0$ is the $l_0$-dimensional measured output vector of the
coordinator subsystem. The vectors $\{x^0(0);\ \theta^0(k) \in R^{n_0};\ \xi^0(k) \in R^{l_0};\ k=0,\dots,N-1\}$
are mutually independent of the random vectors of each subsystem, with

$$E\{x^0(0)\} = 0; \quad \mathrm{Cov}\{x^0(0)\} = \Sigma^0(0)$$

$$E\{\theta^0(k)\} = 0; \quad \mathrm{Cov}\{\theta^0(k)\} = \Theta^0(k)$$

$$E\{\xi^0(k)\} = 0; \quad \mathrm{Cov}\{\xi^0(k)\} = \Xi^0(k)$$

The coordinator chooses $u^0$ to minimize the expected value of the cost
function

$$J^0(u^0) = \tfrac{1}{2}\,x^{0T}(N)Q^0(N)x^0(N) + \tfrac{1}{2}\sum_{i=1}^{M} x^{iT}(N)Q^{0i}(N)x^i(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\Big[x^{0T}(k)Q^0(k)x^0(k) + u^{0T}(k)R^0(k)u^0(k) + \sum_{i=1}^{M} x^{iT}(k)Q^{0i}(k)x^i(k)\Big] \tag{3.6}$$

where $Q^0$, $Q^{0i}$, $R^0$ are all positive definite.
The Stackelberg approach [15] to the coordination of the subsystems is
to consider $DM^0$ as a leader and the $DM^i$ as followers. We imagine that $DM^0$
provides each $DM^i$ with exact knowledge of all decisions made by the coordinator,
and each $DM^i$ minimizes $J^i$ with respect to $u^i$ for each given decision of
$DM^0$, assuming that the other subsystems will do the same. With this
assumption, the subsystems play Nash among themselves. The coordinator
then minimizes $J^0$ with respect to $u^0$, considering that the decisions from
the subsystems result from choices of $u^i$ which minimize $J^i$ for
$i=1,\dots,M$. Additionally, the information sets include exact knowledge
of the system dynamics of $DM^0$ and $DM^i$, the measurements, and the cost
functionals. The statistics of the random elements for all $k$ are also
included.

The optimal feedback Stackelberg approach to the 2-level
coordination of the subsystems [15] is described by the following
procedure: At each stage, the coordinator computes the subsystems'
expected reaction to his decision, based on minimizing the subsystems'
expected cost-to-go assuming that all second-level decision makers will
use their optimal feedback Stackelberg strategies in the future. The
coordinator then seeks to minimize his expected cost-to-go assuming that
the subsystems will respond as expected. Each subsystem then uses the
coordinator's decision to compute his optimal decision, assuming that
other subsystems will do the same. These expectations are conditioned
on the information sets available to each subsystem.
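The leader-follower reasoning in this procedure can be illustrated on a deliberately small case. The sketch below is not taken from the text: it assumes a single stage, scalar state, one follower, and hypothetical parameter names (`a`, `b0`, `b1`, `q0`, `r0`, `q1`, `r1`). Additive zero-mean noise is omitted because it does not change the minimizers of a one-stage quadratic cost.

```python
def follower_reaction(x, u0, a, b0, b1, q1, r1):
    """Follower's best response to an announced leader decision u0.

    Minimizes 0.5*(q1*x_next**2 + r1*u1**2) with x_next = a*x + b0*u0 + b1*u1.
    """
    return -b1 * q1 * (a * x + b0 * u0) / (r1 + b1 * b1 * q1)


def leader_cost(x, u0, a, b0, b1, q0, r0, q1, r1):
    """Leader's cost when the follower reacts optimally to u0."""
    u1 = follower_reaction(x, u0, a, b0, b1, q1, r1)
    x_next = a * x + b0 * u0 + b1 * u1
    return 0.5 * (q0 * x_next ** 2 + r0 * u0 ** 2)


def leader_decision(x, a, b0, b1, q0, r0, q1, r1):
    """Leader's optimum, obtained by substituting the follower's reaction."""
    shrink = r1 / (r1 + b1 * b1 * q1)      # effect of the follower on the dynamics
    a_bar, b_bar = shrink * a, shrink * b0  # reduced dynamics seen by the leader
    return -b_bar * q0 * a_bar * x / (r0 + b_bar * b_bar * q0)
```

The leader never optimizes against the raw dynamics; it optimizes against the reduced pair (`a_bar`, `b_bar`) that already contains the follower's reaction, which is the scalar analogue of the anticipation step described above.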
The information set consists of exact knowledge of the system
dynamics, the measurement rules and the cost functionals of all decision
makers. Additionally, it includes exact knowledge of all decisions made
by each player up to stage $k-1$ and the statistics of the random elements
$\theta^i(k)$, $\xi^i(k)$, $i=0,\dots,M$, for all $k$. Also, the Stackelberg nature of the
game implies that the followers' information contains the exact value of
the leader's decision at time $k$, $u^0(k)$.


Let $\arg\min_u f(k)$ denote the value of $u$ at which $f(k)$ achieves its
absolute minimum. Then the equations that define these optimal
solutions are as follows:

$$u^{i*}(u^0,k) = \arg\min_{u^i} E\{J^i(u^i,x^i,k) \mid z^i(k)\}, \quad i=1,\dots,M \tag{3.7}$$

$$u^{0*}(k) = \arg\min_{u^0} E\{J^0(u^0,x^0,x^i,k) \mid z^0(k)\} \tag{3.8}$$

$$u^{i*}(k) = u^{i*}(u^{0*},k), \quad i=1,\dots,M \tag{3.9}$$

The optimal costs-to-go at each stage are

$$J^{i*}(k) = E\{J^i(u^i,x^i,k) \mid z^i(k),\ u^i=u^{i*},\ u^0=u^{0*}\}, \quad i=1,\dots,M \tag{3.10}$$

$$J^{0*}(k) = E\{J^0(u^0,x^0,x^i,k) \mid z^0(k),\ u^0=u^{0*},\ u^i=u^{i*}\} \tag{3.11}$$


Stochastic  dynamic  programming  can  be  used  to  obtain  the  solutions. 


Two possible cases will be considered in this chapter. First, when
the information is centralized, several classes of information
structures are discussed. One is when all decision makers have perfect
system state measurement. Another is when the information of all the
followers is identical and the coordinator's information contains the
followers' information. Second, we will constrain each controller to be
in decentralized structure, so that the $i$-th subsystem, as well as the
coordinator, knows only its own measurement.

3.3 Coordination with Centralized Information

In general, the coordinator has some information from each
subsystem and, in turn, makes some decisions that will influence the
dynamic response of the lower-level subsystems. By definition of
Stackelberg strategies [52], all decisions made by the coordinator are
known to the second-level decision makers. However, some information
may or may not be available to the coordinator and lower-level
subsystems. When the information sets are centralized, either the
coordinator and the lower-level subsystems have perfect information of
the state, or the lower-level subsystems have the same measurement but the
information set of the coordinator consists of his own measurement and
the lower-level subsystems' measurement. Several particular cases of
this problem are examined. Let us examine a system with one coordinator
and two second-level decision makers.

Consider the augmented system

$$x(k+1) = A(k)x(k) + B^0(k)u^0(k) + B^1(k)u^1(k) + B^2(k)u^2(k) + v(k) \tag{3.12}$$

where

$$x^T(k) = [\,x^{0T}(k)\ \ x^{1T}(k)\ \ x^{2T}(k)\,], \qquad v^T(k) = [\,\theta^{0T}(k)\ \ \theta^{1T}(k)\ \ \theta^{2T}(k)\,]$$

$x(0)$ and $v(k)$ are Gaussian random vectors with zero mean and covariance
$\Sigma(0)$ and $\Lambda(k)$, and the measurement of each subsystem is

$$z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{3.13}$$

The quadratic cost is

$$J^i(u^i) = \tfrac{1}{2}\,x^T(N)Q^i(N)x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Q^i(k)x(k) + u^{iT}(k)R^i(k)u^i(k)\right], \quad i=0,1,2 \tag{3.14}$$


3.3.1 Perfect Information

Suppose all subsystems have perfect information of the states,
i.e., $z^i(k)=x(k)$, $i=0,1,2$. Assume that the expected cost-to-go
at stage $k$ is

$$V^i(k) = \tfrac{1}{2}\,x^T(k)S^i(k)x(k) + \tfrac{1}{2}\,\gamma^i(k), \quad i=0,1,2 \tag{3.15}$$

for some deterministic matrix $S^i(k)$ and scalar function $\gamma^i(k)$.
Using dynamic programming as shown in Appendix 1, the optimum strategies
are

$$u^{0*}(k) = -L^0(k)x(k) \tag{3.16}$$

$$u^{i*}(k) = -\Delta^i(k)\,[A(k)x(k) + B^0(k)u^0(k)], \quad i=1,2 \tag{3.17}$$

where

$$L^0(k) = [R^0(k) + \bar B^T(k)S^0(k+1)\bar B(k)]^{-1}\bar B^T(k)S^0(k+1)\bar A(k)$$

$$\Delta^i(k) = [I - L^i(k)B^j(k)L^j(k)B^i(k)]^{-1}\,(L^i(k) - L^i(k)B^j(k)L^j(k)), \quad i=1,2,\ j=1,2,\ i\neq j$$

$$\bar A(k) = A(k) - B^1(k)\Delta^1(k)A(k) - B^2(k)\Delta^2(k)A(k)$$

$$\bar B(k) = B^0(k) - B^1(k)\Delta^1(k)B^0(k) - B^2(k)\Delta^2(k)B^0(k)$$

$$L^i(k) = [R^i(k) + B^{iT}(k)S^i(k+1)B^i(k)]^{-1}B^{iT}(k)S^i(k+1)$$


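Assuming that the indicated inverses exist, the other quantities are
obtained from

$$S^0(k) = Q^0(k) + \bar A^T(k)S^0(k+1)\bar A(k) - L^{0T}(k)[R^0(k) + \bar B^T(k)S^0(k+1)\bar B(k)]L^0(k) \tag{3.18}$$

$$S^0(N) = Q^0(N) \tag{3.19}$$

$$\gamma^0(k) = \gamma^0(k+1) + \mathrm{tr}\,S^0(k+1)\Lambda(k) \tag{3.20}$$

$$\gamma^0(N) = 0 \tag{3.21}$$

$$S^i(k) = Q^i(k) + [A(k) - B^0(k)L^0(k)]^T\Delta^{iT}(k)R^i(k)\Delta^i(k)[A(k) - B^0(k)L^0(k)] + [\bar A(k) - \bar B(k)L^0(k)]^T S^i(k+1)[\bar A(k) - \bar B(k)L^0(k)], \quad i=1,2 \tag{3.22}$$

$$S^i(N) = Q^i(N), \quad i=1,2 \tag{3.23}$$

$$\gamma^i(k) = \gamma^i(k+1) + \mathrm{tr}\,S^i(k+1)\Lambda(k), \quad i=1,2 \tag{3.24}$$

$$\gamma^i(N) = 0, \quad i=1,2 \tag{3.25}$$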

These recursions are solved backward in time. In summary, we start
at $k=N-1$; $S^0(N)$ and $S^i(N)$, $i=1,2$, are given.

1. Compute $L^i(k)$, $i=1,2$.

2. Compute $\Delta^i(k)$, $i=1,2$.

3. Compute $\bar A(k)$, $\bar B(k)$.

4. Compute $L^0(k)$.

5. Compute $S^0(k)$, $S^i(k)$, $i=1,2$.

6. Set $k \leftarrow k-1$ and go to 1. Stop when $k=0$.

Note that the control laws for the coordinator and the $i$-th subsystem
involve perfect measurement of the state.
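The six steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the author's implementation: it assumes time-invariant matrices, one coordinator (index 0) and two followers (indices 1 and 2), and hypothetical names throughout (`stackelberg_nash_recursion`, `gains`). The $S^i$ update is the quadratic form obtained by substituting the follower law $u^i = -\Delta^i(A - B^0L^0)x$ into the cost-to-go. The scalars $\gamma^i(k)$ are omitted because they do not affect the gains.

```python
import numpy as np


def stackelberg_nash_recursion(A, B, Q, R, QN, N):
    """Backward sweep for the perfect-information case (time-invariant data).

    B, Q, R, QN are lists indexed by player: 0 = coordinator, 1 and 2 = followers.
    Returns the gains (L0, Delta1, Delta2) for k = 0..N-1 and the matrices S^i(0).
    """
    n = A.shape[0]
    S = [QN[i].copy() for i in range(3)]            # S^i(N) = Q^i(N)
    gains = [None] * N
    for k in range(N - 1, -1, -1):
        # Step 1: L^i = [R^i + B^iT S^i B^i]^{-1} B^iT S^i, with S^i = S^i(k+1)
        L = {i: np.linalg.solve(R[i] + B[i].T @ S[i] @ B[i], B[i].T @ S[i])
             for i in (1, 2)}
        # Step 2: Delta^i = [I - L^i B^j L^j B^i]^{-1} (L^i - L^i B^j L^j)
        D = {}
        for i, j in ((1, 2), (2, 1)):
            m_i = B[i].shape[1]
            D[i] = np.linalg.solve(np.eye(m_i) - L[i] @ B[j] @ L[j] @ B[i],
                                   L[i] - L[i] @ B[j] @ L[j])
        # Step 3: dynamics seen by the coordinator after the followers react
        T = np.eye(n) - B[1] @ D[1] - B[2] @ D[2]
        A_bar, B_bar = T @ A, T @ B[0]
        # Step 4: coordinator gain L^0
        W = R[0] + B_bar.T @ S[0] @ B_bar
        L0 = np.linalg.solve(W, B_bar.T @ S[0] @ A_bar)
        # Step 5: cost matrices at stage k
        A_cl = A_bar - B_bar @ L0                   # closed-loop transition
        A_f = A - B[0] @ L0                         # maps x to A x + B^0 u^0
        S_new = [Q[0] + A_bar.T @ S[0] @ A_bar - L0.T @ W @ L0]
        for i in (1, 2):
            S_new.append(Q[i] + A_f.T @ D[i].T @ R[i] @ D[i] @ A_f
                         + A_cl.T @ S[i] @ A_cl)
        S = S_new
        gains[k] = (L0, D[1], D[2])                 # Step 6 is the loop itself
    return gains, S
```

With `gains[k] = (L0, D1, D2)`, the controls at stage `k` are `u0 = -L0 @ x` and `ui = -Di @ (A @ x + B[0] @ u0)`, matching the form of (3.16) and (3.17).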
Illustrative Example

Consider  a linear  system  described  by  the  difference  equation: 

x^k+l)  = 0. 75x^10  + 0.9x2(k)  + 0.9xg(k)  + Uj(k)  + v^Ck) 

x2(k+1)  = 0.3X](k)  + 0.8x2(k)  + O.Sx^Ck)  + u2(k)  + w2(k) 

x^(k+1)  = 0.3xj(k)  + 0.2x9(k)  + O.Qx^Ck)  + u^(k)  + w^(k) 

where $u_i$ are the controls of players $i$, $i=1,2,3$, respectively, and
$\{w_i(k);\ i=1,2,3\}$ are mutually independent Gaussian random vectors with zero
means and known covariances. Let the cost functions be of the form:

Ji  = ^(x^D-p. )“  + fX)  tU^k)-^)2  + ui(k^  i=  1 , 2 , 3 

k=1 

where $p_i$, $i=1,2,3$, and $y_i$, $i=1,2,3$, are constants. This problem is
similar to a tracking problem where the players are trying to force the
states to be as close as possible to some prespecified values while
investing a minimal amount of energy.

Assume that player 1 is the coordinator or leader and players 2 and
3 are followers. Every player will seek control policies which are
functions of the states. We seek the Stackelberg coordination of the
interconnected system with player 1 as the coordinator and players 2 and 3 as
followers who assume Nash rationale between themselves. The
parameters in the cost functional have the following values: $p_i = 0$,
$i=1,2,3$; $y_i = 0$, $i=1,2,3$; and $N = 10$. Fig. 3.1 shows the trajectory
and control policies of the system.
3.3.2 Coordination with Nested Information Structure

1. Incomplete Information for Coordinator and Subsystem

Consider the case where the information of the state is incomplete.
At each stage, in addition to their own estimates, the optimal
strategies would include terms involving an estimate of the other
subsystems' estimates of the state in the future. This leads to
estimators of much larger dimension than the system itself. For a
special case of the stochastic problem, consider the case where each
subsystem has the same measurement

$$z^1(k) = z^2(k) = z(k) = H(k)x(k) + \xi(k)$$

and the coordinator knows both his measurement and all subsystem
measurements. So for any $k$, $z^0(k) \supset z(k)$, implying that the information
sets are nested. We also have to assume that there is no information
transfer among subsystems through their controls [18]. The optimum
strategies for this case are derived in Appendix 2 as

$$u^{i*}(k) = -\Delta^i(k)\,(A(k)\hat x(k) + B^0(k)u^0(k)), \quad i=1,2 \tag{3.26}$$

$$u^{0*}(k) = -\Delta^0(k)Y(k)\hat x^0(k) - \Delta^0(k)M(k)\,[\hat x(k) - \hat x^0(k)] \tag{3.27}$$

$$J^{0*}(k) = \tfrac{1}{2}\begin{bmatrix}\hat x^0(k)\\ \hat x(k)-\hat x^0(k)\end{bmatrix}^T \begin{bmatrix} S^A(k) & S^B(k)\\ S^{BT}(k) & S^C(k)\end{bmatrix} \begin{bmatrix}\hat x^0(k)\\ \hat x(k)-\hat x^0(k)\end{bmatrix} + \tfrac{1}{2}\,\gamma^0(k) \tag{3.28}$$

$$J^{i*}(k) = \tfrac{1}{2}\,\hat x^T(k)S^i(k)\hat x(k) + \tfrac{1}{2}\,\gamma^i(k), \quad i=1,2 \tag{3.29}$$

where $\hat x(k) = E\{x(k) \mid z^1(k)\}$ and $\hat x^0(k) = E\{x(k) \mid z^0(k)\}$.
$\Delta^i(k)$, $\bar A(k)$, $\bar B(k)$, and $L^i(k)$ are defined as in the perfect information case,
with $S^A$ replacing $S^0(k)$. In addition we have

$$S^A(k) = Q^0(k) + A^T(k)(I-G(k))^T S^A(k+1)(I-G(k))A(k) - Y^T(k)\Delta^0(k)Y(k) \tag{3.30}$$

$$\begin{aligned} S^B(k) = {} & A^T(k)(I-G(k))^T S^B(k+1)(I-G(k))A(k) \\ & + A^T(k)(I-G(k))^T\,(S^B(k+1) - S^A(k+1)G(k))\,A(k) \\ & - A^T(k)(I-G(k))^T S^B(k+1)K(k+1)H(k+1)A(k) \\ & - Y^T(k)\Delta^0(k)M(k) \end{aligned} \tag{3.31}$$

$$\begin{aligned} S^C(k) = {} & -M^T(k)\Delta^0(k)M(k) + A^T(k)G^T(k)S^A(k+1)G(k)A(k) \\ & + A^T(k)[I - K(k+1)H(k+1)]^T S^C(k+1)[I - K(k+1)H(k+1)]A(k) \\ & + A^T(k)\,(S^B(k+1)K(k+1)H(k+1) - S^B(k+1))\,G(k)A(k) \\ & - A^T(k)G^T(k)\,(S^B(k+1) - S^B(k+1)K(k+1)H(k+1))\,A(k) \end{aligned} \tag{3.32}$$

$$Y(k) = \bar B^T(k)S^A(k+1)[I-G(k)]A(k)$$

$$M(k) = \bar B^T(k)S^A(k+1)G(k)A(k) + \bar B^T(k)(S^B(k+1) - S^A(k+1))A(k) - \bar B^T(k)S^B(k+1)K(k+1)H(k+1)A(k)$$

$$G(k) = B^1(k)\Delta^1(k) + B^2(k)\Delta^2(k)$$

$$\Delta^0(k) = [R^0(k) + \bar B^T(k)S^A(k+1)\bar B(k)]^{-1}$$

$$K^i(k+1) = P^i(k+1|k)H^{iT}(k+1)\,[H^i(k+1)P^i(k+1|k)H^{iT}(k+1) + \Xi^i(k+1)]^{-1}$$

$$P^i(k+1|k) = A(k)P^i(k|k)A^T(k) + \Lambda(k)$$

$$P^i(k+1|k+1) = [I - K^i(k+1)H^i(k+1)]\,P^i(k+1|k)$$

$$P^i(0|0) = \Sigma(0)$$

for $i=0,1,2$, and where $H^i = H$ for $i=1,2$.
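The filter quantities $K^i$ and $P^i$ used here follow the standard Kalman covariance recursion. The sketch below shows one step of that recursion for a single decision maker. It assumes time-invariant matrices, and the names (`kalman_step`, `P_filt`, `Lam`, `Xi`) are placeholders chosen for this illustration.

```python
import numpy as np


def kalman_step(P_filt, A, Lam, H, Xi):
    """One covariance step: P(k|k) -> K(k+1), P(k+1|k), P(k+1|k+1)."""
    P_pred = A @ P_filt @ A.T + Lam                  # P(k+1|k)
    innov_cov = H @ P_pred @ H.T + Xi                # H P H^T + Xi
    # K = P_pred H^T innov_cov^{-1}; the transpose trick avoids an explicit inverse
    K = np.linalg.solve(innov_cov, H @ P_pred).T
    P_next = (np.eye(A.shape[0]) - K @ H) @ P_pred   # P(k+1|k+1)
    return K, P_pred, P_next
```

Because these covariances do not depend on the measured data, they can be computed ahead of time for every stage before the gain recursions are run.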

v°(k)  = y°(k+l ) + trO°(k)P°(k/k)  + tr(X°U,+  l)CH0(k+i)P°(k+l/k)HoTik»-l) 
*•  =0(k)jiC°r(k*-1)(SA(k+i)  3CUc-»- 1 ) - 2S8(k-*i)), 

+ 2 irP° ( k-t- 1 /k )K ( k *■  i ) rf V k+ 1 ) ^ 3s i k>  I ' -Sc ( k* i ) ) 


3ilk) 


*•  trK( k+ 1 ) 1 H (k+ 1 ) P°(k+ 1 / k )Hr( k 1 ) +-(!<+■  i ) ] :<Ti k - i > 3ci k*  i ) v 3 . 33 ) 
0 1 ( k i 4.  (Aik)*8(kU0'.kiY(k)>rsLik+i)(Avk)*Sik)jPik'YvkH 
4-  uM(khUk)  *-B°ik)^0\k)Y(k)  )rHKkH^i(k)Avk)4-B0;k)^\k)Y«.k)) 


1=1,2 


i 3.31') 


/(k)  = yl(ki-0  + trOAik)P(k/k)  + trS1^k«-i)5C(k+n 

+ trSi(k+l  )K(k+i  )CH(k+i>P(k+»/k>HT(k4.J)+=(k+D]KT(k+0 
+ trC?\k/k)  - P0(k/k)](M(k)-Y(k))T^oT(k)(8oTlk)R1(k)8°(k) 
f tT(k)S1(k+D§(k))^°(k)(M(k)-Y(k))  (3-35) 

The recursive equations (3.30) and (3.34) are identical to equations
(3.18) and (3.22) in the perfect information case, with the same initial
conditions, so that the solutions $S^A(k)$ and $S^i(k)$ in (3.30) and (3.34)
are equal to $S^0(k)$ and $S^i(k)$ in (3.18) and (3.22). Thus, as far as the
followers are concerned, they play a "separation principle" strategy
which consists of the optimal deterministic feedback law applied to their best
estimate of the state. The leader's strategy includes his own estimate
and a term involving a difference in estimates. When both estimates are
the same, the leader also plays as in the "separation principle".


2. Perfect Information for Coordinator

Consider the problem in which the coordinator has perfect state
measurement while the lower-level subsystems have available only noisy
output measurements. In addition, we assume that conditions are such
that the coordinator can deduce exactly the lower-level subsystems'
state estimates, and the lower-level subsystems have the same noisy
measurement, i.e., $z^1(k) = z^2(k) = z(k)$.

When the coordinator has perfect state measurement and can deduce
exactly the state of the lower-level subsystems' state estimator, i.e.,
$H^0(k) = I$ and $\xi^0(k) = 0$, we also have $z^0(k) \supset z(k)$. The problem is of
"nested information" type, except that the coordinator does not have to estimate its
own state ($E\{x(k) \mid z^0(k)\} = x(k)$).

The control law of the coordinator is

$$u^{0*}(k) = -\Delta^0(k)Y(k)x(k) - \Delta^0(k)M(k)\,(\hat x(k) - x(k)) \tag{3.36}$$

and the control laws of the lower-level subsystems are

$$u^{i*}(k) = -\Delta^i(k)\,[A(k)\hat x(k) + B^0(k)u^0(k)], \quad i=1,2 \tag{3.37}$$

where $E\{x(k) \mid z(k)\} = \hat x(k)$. The optimum cost-to-go is

$$J^{0*}(k) = \tfrac{1}{2}\begin{bmatrix} x(k)\\ \hat x(k)-x(k)\end{bmatrix}^T \begin{bmatrix} S^A(k) & S^B(k)\\ S^{BT}(k) & S^C(k)\end{bmatrix} \begin{bmatrix} x(k)\\ \hat x(k)-x(k)\end{bmatrix} + \tfrac{1}{2}\,\gamma^0(k) \tag{3.38}$$

$$J^{i*}(k) = \tfrac{1}{2}\,\hat x^T(k)S^i(k)\hat x(k) + \tfrac{1}{2}\,\gamma^i(k), \quad i=1,2 \tag{3.39}$$

where all matrices are the same as in case 1 of Section 3.3.2.


3. No Measurements for Subsystems

Consider the problem in which the coordinator has a noisy
measurement, while the lower-level subsystems have no measurement
available to them and are restricted to using only a priori information.

When the lower-level subsystems have no measurements, i.e., $H^i(k) = 0$
(null matrix) and $z^i(k) = z^i(0)$ for all $k$, the problem is also of nested
information type. The control law of the coordinator is

$$u^{0*}(k) = -\Delta^0(k)Y(k)\hat x^0(k) - \Delta^0(k)M(k)\,(\hat x(k) - \hat x^0(k)) \tag{3.40}$$

and the control laws of the lower-level subsystems are

$$u^{i*}(k) = -\Delta^i(k)\,[A(k)\hat x(k) + B^0(k)u^0(k)], \quad i=1,2 \tag{3.41}$$

where $E\{x(k) \mid z^0(k)\} = \hat x^0(k)$ and $E\{x(k) \mid z(k)\} = \hat x(k)$.

The optimum cost-to-go is

$$J^{0*}(k) = \tfrac{1}{2}\begin{bmatrix}\hat x^0(k)\\ \hat x(k)-\hat x^0(k)\end{bmatrix}^T \begin{bmatrix} S^A(k) & S^B(k)\\ S^{BT}(k) & S^C(k)\end{bmatrix} \begin{bmatrix}\hat x^0(k)\\ \hat x(k)-\hat x^0(k)\end{bmatrix} + \tfrac{1}{2}\,\gamma^0(k) \tag{3.42}$$

$$J^{i*}(k) = \tfrac{1}{2}\,\hat x^T(k)S^i(k)\hat x(k) + \tfrac{1}{2}\,\gamma^i(k), \quad i=1,2 \tag{3.43}$$

where all matrices are the same as in case 1 of Section 3.3.2.

Substitution of (3.40) and (3.41) into the system equation gives

$$\begin{aligned} x(k+1) = {} & A(k)x(k) - B^1(k)\Delta^1(k)A(k)\hat x(k) - B^2(k)\Delta^2(k)A(k)\hat x(k) \\ & - (B^0(k) - B^1(k)\Delta^1(k)B^0(k) - B^2(k)\Delta^2(k)B^0(k))\,\Delta^0(k)Y(k)\hat x^0(k) \\ & - (B^0(k) - B^1(k)\Delta^1(k)B^0(k) - B^2(k)\Delta^2(k)B^0(k))\,\Delta^0(k)M(k)\,(\hat x(k) - \hat x^0(k)) + v(k) \end{aligned} \tag{3.44}$$

It follows that the optimal estimate of the states by the lower-level
subsystems, given only a priori information, i.e., no output
measurement, is given by

$$\hat x(k+1) = \big[A(k) - B^1(k)\Delta^1(k)A(k) - B^2(k)\Delta^2(k)A(k) - (B^0(k) - B^1(k)\Delta^1(k)B^0(k) - B^2(k)\Delta^2(k)B^0(k))\,\Delta^0(k)Y(k)\big]\hat x(k) \tag{3.45}$$

with initial condition $\hat x(0|0) = \bar x(0)$.

In addition, when $\bar x(0) = 0$, then $\hat x(k|k) = 0$, so that

$$u^{0*}(k) = -\Delta^0(k)\,[Y(k) - M(k)]\,\hat x^0(k) \tag{3.46}$$

and

$$u^{i*}(k) = -\Delta^i(k)B^0(k)u^0(k), \quad i=1,2 \tag{3.47}$$


3.4  Constrained  Decentralised  Structure 

It may be desirable to have a control policy that is simpler to
implement than the optimal policy. Satisfactory control of a high-order
linear system may often be achieved using relatively few system
measurements and a controller of low order. This has been the
motivation for a number of optimal designs, using output feedback or
dynamic controllers of a specified order. For recent work in this field
we refer the reader to [31,32,34,37,38,54].

3.4.1  Decentralized  Control  with  Instantaneous  Output  Feedback 

Consider the stochastic problem where a restriction is placed on
the control of the $i$-th subsystem and the coordinator at any instant to
be a linear transformation of the measurement at that instant. Also,
there is no information transfer among subsystems through their
controls. This simplifies the problem since a filter is no longer used
to estimate the state. Then

$$u^i(k) = F^i(k)z^i(k), \quad i=0,1,2,\quad k=0,1,\dots,N-1 \tag{3.48}$$

where $F^i(k)$ is to be determined to minimize the expected value of
$J^i(u^i)$.

Consider the augmented system (3.12) and the measurement

$$z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{3.49}$$

Then

$$u^i(k) = F^i(k)H^i(k)x(k) + F^i(k)\xi^i(k), \quad i=0,1,2 \tag{3.50}$$

and

$$x(k+1) = \Big(A(k) + \sum_{i=0}^{2} B^i(k)F^i(k)H^i(k)\Big)x(k) + \sum_{i=0}^{2} B^i(k)F^i(k)\xi^i(k) + v(k) \tag{3.51}$$

Define $P(k) = E\{x(k)x^T(k)\}$ and note that $x(k)$ depends on $\xi^i(l)$ for
$l=0,1,\dots,k-1$ only, implying that $E\{x(k)\xi^{iT}(k)\} = 0$. Then the recursive
equation for $P(k)$ is

$$P(k+1) = \Big[A(k) + \sum_{i=0}^{2} B^i(k)F^i(k)H^i(k)\Big]P(k)\Big[A(k) + \sum_{i=0}^{2} B^i(k)F^i(k)H^i(k)\Big]^T + \sum_{i=0}^{2} B^i(k)F^i(k)\Xi^i(k)F^{iT}(k)B^{iT}(k) + \Lambda(k) \tag{3.52}$$

Lemma 3.1 If the linear system described by (3.12) is controlled using a
linear control policy (3.48), $i=1,2$, then the expected cost (3.14), $i=1,2$,
can be expressed as

$$\begin{aligned} E[J^i(k)] = {} & \tfrac{1}{2}E[x^T(k)S^i(k)x(k)] + \tfrac{1}{2}\sum_{l=k+1}^{N} \mathrm{tr}\,S^i(l)\Lambda(l-1) \\ & + \tfrac{1}{2}\sum_{l=k+1}^{N}\Big\{\mathrm{tr}\,F^{iT}(l-1)[R^i(l-1) + B^{iT}(l-1)S^i(l)B^i(l-1)]F^i(l-1)\Xi^i(l-1) \\ & \qquad + \sum_{j=0,\ j\neq i}^{2} \mathrm{tr}\,F^{jT}(l-1)B^{jT}(l-1)S^i(l)B^j(l-1)F^j(l-1)\Xi^j(l-1)\Big\}, \quad i=1,2 \end{aligned} \tag{3.53}$$

where

$$S^i(k) = Q^i(k) + H^{iT}(k)F^{iT}(k)R^i(k)F^i(k)H^i(k) + \Big[A(k) + \sum_{j=0}^{2} B^j(k)F^j(k)H^j(k)\Big]^T S^i(k+1)\Big[A(k) + \sum_{j=0}^{2} B^j(k)F^j(k)H^j(k)\Big], \quad i=1,2 \tag{3.54}$$

$$S^i(N) = Q^i(N) \tag{3.55}$$

Proof The proof is by induction.

Consider the augmented system (3.12) and the cost criterion (3.14).
The assumption obviously holds for $k=N$. For any $k$

$$\begin{aligned} E[J^i(k)] & = \tfrac{1}{2}\sum_{l=k}^{N-1} E\{x^T(l)Q^i(l)x(l) + u^{iT}(l)R^i(l)u^i(l)\} + \tfrac{1}{2}E\{x^T(N)Q^i(N)x(N)\} \\ & = E[J^i(k+1)] + \tfrac{1}{2}E\{x^T(k)Q^i(k)x(k) + u^{iT}(k)R^i(k)u^i(k)\}, \quad i=1,2 \end{aligned} \tag{3.56}$$

Using (3.53) with $k$ replaced by $k+1$ in (3.56), some algebra shows that
(3.53) holds for $k$ whenever it holds for $k+1$. Thus (3.53) holds for
$k=0,1,\dots,N$. The necessary
condition for a minimum at each step is that the derivative of the
remaining cost with respect to $F^i(k)$, $i=1,2$, must equal zero:

$$F^{1*}(k) = -[R^1 + B^{1T}S^1(k+1)B^1]^{-1}B^{1T}S^1(k+1)\,[A + B^0F^0(k)H^0 + B^2F^2(k)H^2]\,P(k)H^{1T}[H^1P(k)H^{1T} + \Xi^1]^{-1} \tag{3.57}$$

$$F^{2*}(k) = -[R^2 + B^{2T}S^2(k+1)B^2]^{-1}B^{2T}S^2(k+1)\,[A + B^0F^0(k)H^0 + B^1F^1(k)H^1]\,P(k)H^{2T}[H^2P(k)H^{2T} + \Xi^2]^{-1} \tag{3.58}$$

or

$$F^{1*}(k) = \Gamma^1(k)\,[A(k) + B^0(k)F^0(k)H^0(k)]\,T^1(k) \tag{3.59}$$

$$F^{2*}(k) = \Gamma^2(k)\,[A(k) + B^0(k)F^0(k)H^0(k)]\,T^2(k) \tag{3.60}$$

where

$$\Gamma^i(k) = [I - M^i(k)B^j(k)M^j(k)B^i(k)]^{-1}\,[M^i(k) + M^i(k)B^j(k)M^j(k)], \quad i=1,2,\ j=1,2,\ i\neq j$$

$$T^i(k) = [Y^i(k) + Y^j(k)H^j(k)Y^i(k)]\,[I - H^i(k)Y^j(k)H^j(k)Y^i(k)]^{-1}, \quad i=1,2,\ j=1,2,\ i\neq j$$

$$M^i(k) = -[R^i(k) + B^{iT}(k)S^i(k+1)B^i(k)]^{-1}B^{iT}(k)S^i(k+1), \quad i=1,2$$

$$Y^i(k) = P(k)H^{iT}(k)\,[H^i(k)P(k)H^{iT}(k) + \Xi^i(k)]^{-1}, \quad i=1,2$$
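The expression in Lemma 3.1 can be checked numerically by computing the expected cost in two ways: directly, by propagating $P(k)$ forward as in (3.52) and summing the stage costs, and through the backward recursion (3.54)-(3.55) inserted into (3.53) at $k=0$. The sketch below assumes time-invariant matrices, fixed (not optimized) gains, and $Q^i(N) = Q^i$. The function name and argument layout are hypothetical.

```python
import numpy as np


def expected_cost_two_ways(A, B, H, F, Q, R, Xi, Lam, Sigma0, i, N):
    """Expected cost of player i under fixed gains F[0..2], computed two ways."""
    A_cl = A + sum(B[j] @ F[j] @ H[j] for j in range(3))
    drive = Lam + sum(B[j] @ F[j] @ Xi[j] @ F[j].T @ B[j].T for j in range(3))

    # Direct evaluation: forward covariance (3.52) and accumulated stage costs.
    P = Sigma0.copy()
    direct = 0.0
    for _ in range(N):
        Euu = F[i] @ (H[i] @ P @ H[i].T + Xi[i]) @ F[i].T   # E{u^i u^iT}
        direct += 0.5 * (np.trace(Q[i] @ P) + np.trace(R[i] @ Euu))
        P = A_cl @ P @ A_cl.T + drive
    direct += 0.5 * np.trace(Q[i] @ P)                       # terminal term

    # Lemma 3.1: backward S^i from (3.54)-(3.55), then (3.53) at k = 0.
    S = Q[i].copy()                                          # S^i(N)
    lemma = 0.0
    for _ in range(N):                                       # l = N, ..., 1
        lemma += 0.5 * np.trace(S @ Lam)
        lemma += 0.5 * np.trace(F[i].T @ (R[i] + B[i].T @ S @ B[i]) @ F[i] @ Xi[i])
        for j in range(3):
            if j != i:
                lemma += 0.5 * np.trace(F[j].T @ B[j].T @ S @ B[j] @ F[j] @ Xi[j])
        S = Q[i] + H[i].T @ F[i].T @ R[i] @ F[i] @ H[i] + A_cl.T @ S @ A_cl
    lemma += 0.5 * np.trace(S @ Sigma0)                      # E{x(0)^T S(0) x(0)}
    return direct, lemma
```

The two returned values agree for any choice of gains, which is what lets the gains be optimized against $S^i$ and $P$ instead of against sample paths.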


Lemma 3.2 If a linear system described by (3.12) is controlled using a
linear control policy (3.48) for $i=0$, then the expected cost for
$i=0$ is expressed as

$$\begin{aligned} E[J^0(k)] = {} & \tfrac{1}{2}E[x^T(k)S^0(k)x(k)] + \tfrac{1}{2}\sum_{l=k+1}^{N}\Big[\mathrm{tr}\,S^0(l)\Lambda(l-1) \\ & + \mathrm{tr}\,F^{0T}(l-1)[R^0(l-1) + B^{0T}(l-1)S^0(l)B^0(l-1)]F^0(l-1)\Xi^0(l-1) \\ & + \sum_{j=1}^{2} \mathrm{tr}\,F^{j*T}(l-1)B^{jT}(l-1)S^0(l)B^j(l-1)F^{j*}(l-1)\Xi^j(l-1)\Big] \end{aligned} \tag{3.61}$$

where

$$S^0(k) = Q^0(k) + H^{0T}(k)F^{0T}(k)R^0(k)F^0(k)H^0(k) + \Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2} B^i(k)F^{i*}(k)H^i(k)\Big]^T S^0(k+1)\Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2} B^i(k)F^{i*}(k)H^i(k)\Big] \tag{3.62}$$

$$S^0(N) = Q^0(N) \tag{3.63}$$

At each step the necessary condition for a minimum is that the
derivative of the remaining cost with respect to each element of $F^0(k)$
must equal zero. With the stage argument $k$ omitted on the right-hand side
except in $S^0(k+1)$ and $P(k)$, this gives

$$\begin{aligned} F^0(k) = {} & -\big[R^0 + (B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)^T S^0(k+1)(B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)\big]^{-1} \\ & \big\{(B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)^T S^0(k+1)\,[A + B^1\Gamma^1AT^1H^1 + B^2\Gamma^2AT^2H^2]\,P(k)\,[H^0 + H^0T^1H^1 + H^0T^2H^2]^T \\ & \quad + [B^1\Gamma^1B^0 + B^2\Gamma^2B^0]^T S^0(k+1)\,[B^1\Gamma^1AT^1\Xi^1T^{1T}H^{0T} + B^2\Gamma^2AT^2\Xi^2T^{2T}H^{0T}]\big\} \\ & \big[H^0P(k)H^{0T} + (H^0T^1H^1 + H^0T^2H^2)\,P(k)\,(H^0 + H^0T^1H^1 + H^0T^2H^2)^T \\ & \quad + (\Xi^0 + H^0T^1\Xi^1T^{1T}H^{0T} + H^0T^2\Xi^2T^{2T}H^{0T})\big]^{-1} \end{aligned} \tag{3.64}$$

Theorem 3.1 The sequences $\{F^i(k)\}$, $i=0,1,2$; $k=0,1,\dots,N-1$, of the $i$-th
subsystem that minimize $E\{J^i(u^i)\}$, $i=0,1,2$, subject to the constraint
(3.48) are given by equations (3.59), (3.60) and (3.64), where it is
assumed that the required inverses exist and

1.
$$P(k+1) = \Big[A(k) + \sum_{i=0}^{2} B^i(k)F^i(k)H^i(k)\Big]P(k)\Big[A(k) + \sum_{i=0}^{2} B^i(k)F^i(k)H^i(k)\Big]^T + \sum_{i=0}^{2} B^i(k)F^i(k)\Xi^i(k)F^{iT}(k)B^{iT}(k) + \Lambda(k) \tag{3.65}$$

$P(0)$ is given.

2.
$$S^i(k) = Q^i(k) + H^{iT}(k)F^{iT}(k)R^i(k)F^i(k)H^i(k) + \Big[A(k) + \sum_{j=0}^{2} B^j(k)F^j(k)H^j(k)\Big]^T S^i(k+1)\Big[A(k) + \sum_{j=0}^{2} B^j(k)F^j(k)H^j(k)\Big], \quad i=1,2 \tag{3.66}$$

$$S^i(N) = Q^i(N), \quad i=1,2 \tag{3.67}$$

3.
$$S^0(k) = Q^0(k) + H^{0T}(k)F^{0T}(k)R^0(k)F^0(k)H^0(k) + \Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2} B^i(k)F^{i*}(k)H^i(k)\Big]^T S^0(k+1)\Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2} B^i(k)F^{i*}(k)H^i(k)\Big] \tag{3.68}$$

$$S^0(N) = Q^0(N) \tag{3.69}$$


The sequences $\{F^i(k)\}$, $i=0,1,2$; $k=0,1,\dots,N-1$, of the coordinator
and the $i$-th subsystem are the solution to a discrete two-point
boundary value problem. Note that (3.65), (3.66) and (3.68) are
recursive relationships for generating $P(k)$ and $S^i(k)$, $i=0,1,2$, except that
(3.65) is a forward equation while (3.66) and (3.68) are
backward equations, and all of them depend on the sequence $\{F^i(k)\}$. Unless
$\{F^i(k)\}$, or $\{S^i(k)\}$ and $\{P(k)\}$, are known, no simple calculation will solve
the problem. We suggest the following simple procedure to solve the equations:

1. Make an initial guess for the gains $\{F^0_j(k)\}$ and $\{F^i_j(k)\}$, $i=1,2$;
$k=0,1,\dots,N-1$. Let $j=0$.

2. Use $\{F^0_j(k)\}$ and $\{F^i_j(k)\}$ to solve (3.65) forward in time to determine
$\{P_j(k)\}$ with $P_j(0) = \Sigma(0)$.

3. Use $\{F^0_j(k)\}$ and $\{F^i_j(k)\}$ to solve (3.66) and (3.68) backward in time
to determine $\{S^i_j(k)\}$, $i=1,2$, and $\{S^0_j(k)\}$ with $S^i_j(N) = Q^i(N)$, $i=0,1,2$.

4. Use $\{P_j(k)\}$ and $\{S^0_j(k)\}$ in (3.64) to determine $\{F^0_{j+1}(k)\}$.

5. Use $\{P_j(k)\}$, $\{S^i_j(k)\}$, $i=1,2$, and $\{F^0_{j+1}(k)\}$ in (3.59) and (3.60) to
determine $\{F^i_{j+1}(k)\}$, $i=1,2$. Let $j=j+1$.

6. Repeat (2)-(5) until the desired degree of convergence is reached.

So  far  no  convergence  conditions  for  this  algorithm  have  been 
found,  but  as  with  most  algorithms  of  this  type  it  is  expected  that 
convergence  depends  on  the  initial  guess. 
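The forward and backward sweep pattern of this procedure can be sketched on a reduced problem. The code below is an illustration only and assumes a single controller with instantaneous output feedback $u(k) = F(k)z(k)$, for which the stationarity condition takes the same form as (3.57) with the other players' terms removed. It is not the full three-player algorithm, and all names (`sweep_output_feedback`, `iters`) are hypothetical.

```python
import numpy as np


def sweep_output_feedback(A, B, H, Q, R, QN, Xi, Lam, Sigma0, N, iters):
    """Fixed-point sweeps for u(k) = F(k) z(k), z(k) = H x(k) + xi(k)."""
    m, l = B.shape[1], H.shape[0]
    F = [np.zeros((m, l)) for _ in range(N)]          # step 1: initial guess
    for _ in range(iters):
        # step 2: forward covariance recursion with the current gains
        P = [Sigma0]
        for k in range(N):
            A_cl = A + B @ F[k] @ H
            P.append(A_cl @ P[k] @ A_cl.T + B @ F[k] @ Xi @ F[k].T @ B.T + Lam)
        # step 3: backward cost recursion with the current gains
        S = [None] * (N + 1)
        S[N] = QN
        for k in range(N - 1, -1, -1):
            A_cl = A + B @ F[k] @ H
            S[k] = Q + H.T @ F[k].T @ R @ F[k] @ H + A_cl.T @ S[k + 1] @ A_cl
        # steps 4-5: update every gain from the stationarity condition
        F = [-np.linalg.solve(R + B.T @ S[k + 1] @ B, B.T @ S[k + 1] @ A)
             @ P[k] @ H.T @ np.linalg.inv(H @ P[k] @ H.T + Xi)
             for k in range(N)]
    return F
```

In the special case of perfect state measurement ($H = I$, no measurement noise) the update no longer depends on $P(k)$, and the gains settle to the standard linear-quadratic regulator gains after $N$ sweeps. For general $H$ the remark above applies and convergence is not guaranteed.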

3.4.2 Decentralized Control with Dynamic Output Feedback

Consider the stochastic problem where a dynamic controller of a
specified order is used for the $i$-th subsystem and the coordinator, described by

$$w^i(k+1) = D^i(k)w^i(k) + M^i(k)z^i(k), \quad i=0,1,2 \tag{3.70}$$

where $w^i \in R^{s_i}$ is the state vector of the controller used. Then

$$u^i(k) = N^i(k)w^i(k) + F^i(k)z^i(k), \quad i=0,1,2 \tag{3.71}$$

also

$$z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{3.72}$$

For a given integer $s_i$ ($0 \le s_i \le n$), find matrices $N^i(k)$, $F^i(k)$, $D^i(k)$ and
$M^i(k)$ such that the corresponding expected cost $E\{J^i(u^i)\}$ is
a minimum. Note that if $s_i = 0$ the controller reduces to

$$u^i(k) = F^i(k)z^i(k), \quad i=0,1,2$$

and if $s_i = n$, an optimal solution is obtained. The cost functional to
be considered is the same as in Section 3.4.1.


Consider the augmented state vector

$$\tilde x^T(k) = [\,x^T(k)\ \ w^{0T}(k)\ \ w^{1T}(k)\ \ w^{2T}(k)\,]$$

then

$$\tilde x(k+1) = \Big(\tilde A(k) + \sum_{i=0}^{2} \tilde B^i(k)\tilde F^i(k)\tilde H^i(k)\Big)\tilde x(k) + \sum_{i=0}^{2} \tilde B^i(k)\tilde F^i(k)\tilde I^i\,\xi^i(k) + \tilde I\,v(k) \tag{3.73}$$

where

$$\tilde F^i(k) = \begin{bmatrix} F^i(k) & N^i(k)\\ M^i(k) & D^i(k)\end{bmatrix}$$

and

$$u^i(k) = T\tilde F^i(k)\tilde H^i(k)\tilde x(k) + T\tilde F^i(k)\tilde I^i\,\xi^i(k), \quad i=0,1,2 \tag{3.74}$$

where $T = [\,I\ \ 0\,]$.

Let $\tilde P(k) = E[\tilde x(k)\tilde x^T(k)]$. The cost functional of the coordinator is

$$J^0 = \tfrac{1}{2}\,\tilde x^T(N)\tilde Q^0(N)\tilde x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[\tilde x^T(k)\tilde Q^0(k)\tilde x(k) + u^{0T}(k)R^0(k)u^0(k)\right] \tag{3.75}$$

Also, the cost functional of the lower-level subsystems is

$$J^i = \tfrac{1}{2}\,\tilde x^T(N)\tilde Q^i(N)\tilde x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[\tilde x^T(k)\tilde Q^i(k)\tilde x(k) + u^{iT}(k)R^i(k)u^i(k)\right] \tag{3.76}$$

The augmented system (3.73) and controller (3.74) are of the same form
as (3.51) and (3.49). Also the cost functionals are the same. The
following theorem can be derived using the same argument as Theorem 3.1.
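The reason a dynamic controller reduces to the instantaneous output feedback case can be seen from a standard augmentation. The sketch below assumes a single controller and noise-free signals, and its block matrices are one common choice made for this illustration, not necessarily the ones used in (3.73). The helper names `augment` and `stack_gain` are hypothetical.

```python
import numpy as np


def augment(A, B, H, s):
    """Augmented matrices for plant state x (dim n) and controller state w (dim s)."""
    n, m, l = A.shape[0], B.shape[1], H.shape[0]
    A_t = np.block([[A, np.zeros((n, s))],
                    [np.zeros((s, n)), np.zeros((s, s))]])
    B_t = np.block([[B, np.zeros((n, s))],
                    [np.zeros((s, m)), np.eye(s)]])    # second input writes w(k+1)
    H_t = np.block([[H, np.zeros((l, s))],
                    [np.zeros((s, n)), np.eye(s)]])    # second output reads w(k)
    return A_t, B_t, H_t


def stack_gain(F, N_gain, M, D):
    """Static gain [[F, N], [M, D]] acting on the augmented output [z; w]."""
    return np.block([[F, N_gain], [M, D]])
```

With these blocks, `A_t + B_t @ stack_gain(F, N_gain, M, D) @ H_t` reproduces the closed loop of the plant with `w(k+1) = D w(k) + M z(k)` and `u(k) = N w(k) + F z(k)`, so the tools developed for static gains apply to the stacked gain unchanged.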

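Theorem 3.2 The sequences $\{\tilde F^i(k)\}$, $i=0,1,2$; $k=0,1,\dots,N-1$, of the
coordinator and the $i$-th subsystem that minimize $E\{J^i(u^i)\}$, $i=0,1,2$,
subject to the constraint (3.74) are given by

$$\begin{aligned} \tilde F^0(k) = {} & -\big[R^0 + (\tilde B^0 + \tilde B^1\tilde\Gamma^1\tilde B^0 + \tilde B^2\tilde\Gamma^2\tilde B^0)^T \tilde S^0(k+1)(\tilde B^0 + \tilde B^1\tilde\Gamma^1\tilde B^0 + \tilde B^2\tilde\Gamma^2\tilde B^0)\big]^{-1} \\ & \big\{(\tilde B^0 + \tilde B^1\tilde\Gamma^1\tilde B^0 + \tilde B^2\tilde\Gamma^2\tilde B^0)^T \tilde S^0(k+1)\,[\tilde A + \tilde B^1\tilde\Gamma^1\tilde A\tilde T^1\tilde H^1 + \tilde B^2\tilde\Gamma^2\tilde A\tilde T^2\tilde H^2]\,\tilde P\,[\tilde H^0 + \tilde H^0\tilde T^1\tilde H^1 + \tilde H^0\tilde T^2\tilde H^2]^T \\ & \quad + [\tilde B^1\tilde\Gamma^1\tilde B^0 + \tilde B^2\tilde\Gamma^2\tilde B^0]^T \tilde S^0(k+1)\,[\tilde B^1\tilde\Gamma^1\tilde A\tilde T^1\tilde\Xi^1\tilde T^{1T}\tilde H^{0T} + \tilde B^2\tilde\Gamma^2\tilde A\tilde T^2\tilde\Xi^2\tilde T^{2T}\tilde H^{0T}]\big\} \\ & \big[\tilde H^0\tilde P\tilde H^{0T} + (\tilde H^0\tilde T^1\tilde H^1 + \tilde H^0\tilde T^2\tilde H^2)\,\tilde P\,(\tilde H^0 + \tilde H^0\tilde T^1\tilde H^1 + \tilde H^0\tilde T^2\tilde H^2)^T \\ & \quad + (\tilde\Xi^0 + \tilde H^0\tilde T^1\tilde\Xi^1\tilde T^{1T}\tilde H^{0T} + \tilde H^0\tilde T^2\tilde\Xi^2\tilde T^{2T}\tilde H^{0T})\big]^{-1} \end{aligned} \tag{3.77}$$

$$\tilde F^{i*}(k) = \tilde\Gamma^i(k)\,[\tilde A(k) + \tilde B^0(k)\tilde F^0(k)\tilde H^0(k)]\,\tilde T^i(k), \quad i=1,2 \tag{3.78}$$

where

$$\tilde\Gamma^i = [I - \tilde M^i\tilde B^j\tilde M^j\tilde B^i]^{-1}\,[\tilde M^i + \tilde M^i\tilde B^j\tilde M^j], \quad i=1,2,\ j=1,2,\ i\neq j$$

$$\tilde T^i = [\tilde Y^i + \tilde Y^j\tilde H^j\tilde Y^i]\,[I - \tilde H^i\tilde Y^j\tilde H^j\tilde Y^i]^{-1}, \quad i=1,2,\ j=1,2,\ i\neq j$$

$$\tilde M^i = -[R^i + \tilde B^{iT}\tilde S^i(k+1)\tilde B^i]^{-1}\tilde B^{iT}\tilde S^i(k+1), \quad i=1,2$$

$$\tilde Y^i = \tilde P(k)\tilde H^{iT}\,[\tilde H^i\tilde P(k)\tilde H^{iT} + \tilde\Xi^i]^{-1}, \quad i=1,2$$

It is assumed that the required inverse matrices exist and where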
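1.
$$\tilde P(k+1) = \Big[\tilde A + \sum_{i=0}^{2} \tilde B^i\tilde F^i(k)\tilde H^i\Big]\tilde P(k)\Big[\tilde A + \sum_{i=0}^{2} \tilde B^i\tilde F^i(k)\tilde H^i\Big]^T + \sum_{i=0}^{2} \tilde B^i\tilde F^i(k)\tilde\Xi^i\tilde F^{iT}(k)\tilde B^{iT} + \tilde\Lambda(k) \tag{3.79}$$

$\tilde P(0)$ is given.

2.
$$\tilde S^i(k) = \tilde Q^i + \tilde H^{iT}\tilde F^{iT}(k)R^i\tilde F^i(k)\tilde H^i + \Big[\tilde A + \sum_{j=0}^{2} \tilde B^j\tilde F^j(k)\tilde H^j\Big]^T \tilde S^i(k+1)\Big[\tilde A + \sum_{j=0}^{2} \tilde B^j\tilde F^j(k)\tilde H^j\Big] \tag{3.80}$$

$$\tilde S^i(N) = \tilde Q^i(N)$$

3.
$$\tilde S^0(k) = \tilde Q^0 + \tilde H^{0T}\tilde F^{0T}(k)R^0\tilde F^0(k)\tilde H^0 + \Big[\tilde A + \tilde B^0\tilde F^0(k)\tilde H^0 + \sum_{i=1}^{2} \tilde B^i\tilde F^{i*}(k)\tilde H^i\Big]^T \tilde S^0(k+1)\Big[\tilde A + \tilde B^0\tilde F^0(k)\tilde H^0 + \sum_{i=1}^{2} \tilde B^i\tilde F^{i*}(k)\tilde H^i\Big] \tag{3.81}$$

$$\tilde S^0(N) = \tilde Q^0(N)$$

Again the sequences $\{\tilde F^i(k)\}$, $i=0,1,2$; $k=0,1,\dots,N-1$, of the coordinator
and the $i$-th subsystem are the solution to a discrete two-point
boundary value problem as in the previous case, but it is more complicated to
solve.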


In the case where either the coordinator has noisy measurements or
the lower-level subsystems have no noise in their measurements, and they
want to use output feedback, they can do so by setting the dimension of
their controller to zero.
3.5 Conclusions

The control of an interconnected set of linear discrete-time
stochastic systems has been considered. The organisational form of the
system permits one decision maker to be the coordinator or leader, and
the decision makers for the other subsystems are all followers with
respect to the coordinator, but they use the Nash strategy with respect
to other second-level decision makers. Both centralized and
decentralized control structures were considered. As in single decision
maker control problems with output feedback constraints,
decentralization constraints generally lead to two-point boundary value
problems. Explicit recursive formulas for these two-point boundary
value problems have been derived. The sequential decision approach
seems to be a natural one when the cost function associated with one
decision maker has a more global significance compared to the others.
This decision maker takes the role of a coordinator and leader.


4.  STACKELBERG COORDINATION
WITH PARETO RATIONALE AMONG LOWER-LEVEL SUBSYSTEMS


4.1  Introduction


In the previous chapter, a sequential decision approach to the
control of an interconnected system, where the lower-level subsystems
choose to play Nash rationale among themselves, has been obtained. An
extension of this sequential decision approach is to consider the problem
when the lower-level subsystems choose to play Pareto optimal. It is
possible that the lower-level subsystems desire to cooperate within
their group. Then the resulting set of controls should be chosen from
the Pareto optimal set of solutions. In this chapter, we will
investigate the Stackelberg coordination of a discrete linear quadratic
Gaussian problem when the lower-level subsystems cooperate among
themselves. Several types of information structure are considered. The
main ideas in this chapter are basically derived from Chapter 3. To
avoid repetition of identical arguments, a more compact treatment is
presented.


4.2  Problem Formulation


Consider the discrete-time linear system described in Chapter 3.
The Stackelberg approach to the coordination of the subsystems is
to consider $DM^0$ as a leader and $DM^i$, i=1,...,M, as followers.
Each follower minimizes $J$ with respect to $u^i$ for each given decision
of $DM^0$, assuming that all the followers agree on cooperation. With this
assumption the subsystems use Pareto optimal strategies among themselves.
The coordinator then minimizes $J^0$ with respect to $u^0$, considering that
the decisions from the subsystems result from choices of $u^i$ which minimize
$J$ for i=1,...,M. Additionally, the information sets include exact
knowledge of the system dynamics, $DM^0$, $DM^i$, the measurements and the
cost functionals. The statistics of the random elements for all k are also
included. Consider the augmented system

\[ x(k+1) = Ax(k) + B^0u^0(k) + Bu(k) + v(k) \tag{4.1} \]

where

\[ x^T(k) = [\, x^{0T}(k) \;\; x^{1T}(k) \;\; x^{2T}(k) \,] \]
\[ v^T(k) = [\, v^{0T}(k) \;\; v^{1T}(k) \;\; v^{2T}(k) \,] \]
\[ u^T(k) = [\, u^{1T}(k) \;\; u^{2T}(k) \,] \]

x(0) and v(k) are Gaussian random vectors with zero means and covariances
$\Sigma(0)$ and $\Lambda(k)$. The measurement equation of each subsystem is given by

\[ z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{4.2} \]

The quadratic cost of the lower-level subsystems becomes

\[ J(u) = \sum_{i=1}^{2}\alpha_iJ^i(u^i); \quad \alpha_i > 0; \quad \sum_{i=1}^{2}\alpha_i = 1 \tag{4.3} \]

\[ J(u) = \tfrac{1}{2}x^T(N)Q(N)x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Q(k)x(k) + u^T(k)R(k)u(k)\right] \tag{4.4} \]

Also, the quadratic cost of the coordinator becomes

\[ J^0(u^0) = \tfrac{1}{2}x^T(N)Q^0(N)x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Q^0(k)x(k) + u^{0T}(k)R^0(k)u^0(k)\right] \tag{4.5} \]

The equations that define the optimal solutions are as follows:

\[ u_o(u^0,k) = \arg\min_{u} E\{J(u,x,k)/z(k)\} \]

\[ u^{0*}(k) = \arg\min_{u^0} E\{J^0(u^0,x,k)/z^0(k)\} \]

\[ u^*(k) = u_o(u^{0*},k) \]

The optimal costs-to-go at each stage are

\[ J^*(k) = E\{J(u,x,k)/z(k),\; u = u^*,\; u^0 = u^{0*}\} \]

\[ J^{0*}(k) = E\{J^0(u^0,x,k)/z^0(k),\; u^0 = u^{0*},\; u = u^*\} \]

Centralised and decentralised structures of information are investigated
in the following sections.


4.3  Coordination with Centralised Information

In this section, two cases of centralised information are
considered: perfect information and nested information. Recursive
equations for the design of feedback controllers for the coordinator
and the followers are obtained. For simplicity, a system with one
coordinator and two second-level decision makers is examined.

4.3.1  Perfect  Information 


Suppose all subsystems have perfect information of the state, i.e.
$z^i(k) = x(k)$. Assume that the expected cost-to-go of the lower level at
stage k is

\[ V(k) = \tfrac{1}{2}x^T(k)S(k)x(k) + \tfrac{1}{2}\beta(k) \tag{4.6} \]

for some deterministic matrix $S(k)$ and function $\beta(k)$. Also the expected
cost-to-go of the coordinator at stage k is:

\[ V^0(k) = \tfrac{1}{2}x^T(k)S^0(k)x(k) + \tfrac{1}{2}\beta^0(k) \tag{4.7} \]

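for some deterministic matrix $S^0(k)$ and function $\beta^0(k)$. The optimal
strategies are derived using dynamic programming:

\[ u^{0*}(k) = -\Delta^0(k)\bar{B}^T(k)S^0(k+1)\bar{A}(k)x(k) \tag{4.8} \]

\[ u_o(k) = -\Delta(k)B^T(k)S(k+1)\left[A(k)x(k) + B^0(k)u^0(k)\right] \tag{4.9} \]

where

\[ \Delta(k) = \left[R(k) + B^T(k)S(k+1)B(k)\right]^{-1} \]
\[ \Delta^0(k) = \left[R^0(k) + \bar{B}^T(k)S^0(k+1)\bar{B}(k)\right]^{-1} \]
\[ \bar{A}(k) = A(k) - B(k)\Delta(k)B^T(k)S(k+1)A(k) \]
\[ \bar{B}(k) = B^0(k) - B(k)\Delta(k)B^T(k)S(k+1)B^0(k) \]

Assume all the required inverse matrices exist and

\[ S^0(k) = Q^0(k) + \bar{A}^T(k)S^0(k+1)\bar{A}(k) - \bar{A}^T(k)S^0(k+1)\bar{B}(k)\Delta^0(k)\bar{B}^T(k)S^0(k+1)\bar{A}(k) \tag{4.10} \]

\[ S^0(N) = Q^0(N) \tag{4.11} \]

\[ \beta^0(k) = \beta^0(k+1) + \mathrm{tr}\,S^0(k+1)\Lambda(k) \tag{4.12} \]

\[ \beta^0(N) = 0 \tag{4.13} \]

\[ S(k) = Q(k) + M^T(k)R(k)M(k) + \left[\bar{A}(k) - \bar{B}(k)\Delta^0(k)\bar{B}^T(k)S^0(k+1)\bar{A}(k)\right]^TS(k+1)\left[\bar{A}(k) - \bar{B}(k)\Delta^0(k)\bar{B}^T(k)S^0(k+1)\bar{A}(k)\right] \tag{4.14} \]

\[ S(N) = Q(N) \tag{4.15} \]

\[ M(k) = \Delta(k)B^T(k)S(k+1)\left[A(k) - B^0(k)\Delta^0(k)\bar{B}^T(k)S^0(k+1)\bar{A}(k)\right] \]

\[ \beta(k) = \beta(k+1) + \mathrm{tr}\,S(k+1)\Lambda(k) \tag{4.16} \]

\[ \beta(N) = 0 \tag{4.17} \]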


These equations can be solved backwards in time with the given final
conditions. The condition for the existence of the solutions is that
the matrices to be inverted are nonsingular.
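As a minimal sketch of how the backward sweep (4.8)-(4.15) might be organized, the following Python function treats the scalar case, where every matrix reduces to a number and every inverse to a division. The function name, the argument names and the use of time-invariant coefficients are assumptions made for illustration; they are not part of the original development.

```python
def stackelberg_perfect_info(a, b0, b, q, r, q0, r0, q_final, q0_final, n_steps):
    """Scalar backward recursion for the perfect-information case.

    Returns the gain pairs (L0(k), M(k)) in forward time order, with
    u0(k) = -L0(k) x(k) and u(k) = -M(k) x(k), plus S(0) and S0(0).
    """
    s, s0 = q_final, q0_final                 # S(N) and S0(N)
    gains = []
    for _ in range(n_steps):
        delta = 1.0 / (r + b * s * b)         # Delta(k)
        a_bar = a - b * delta * b * s * a     # A-bar(k)
        b_bar = b0 - b * delta * b * s * b0   # B-bar(k)
        delta0 = 1.0 / (r0 + b_bar * s0 * b_bar)   # Delta0(k)
        l0 = delta0 * b_bar * s0 * a_bar      # coordinator gain from (4.8)
        m = delta * b * s * (a - b0 * l0)     # follower gain M(k)
        a_cl = a_bar - b_bar * l0             # closed-loop coefficient
        # Riccati-type update for the coordinator, as in (4.10)
        s0_new = q0 + a_bar * s0 * a_bar - (a_bar * s0 * b_bar) * delta0 * (b_bar * s0 * a_bar)
        # Cost update for the followers, as in (4.14)
        s_new = q + m * r * m + a_cl * s * a_cl
        s, s0 = s_new, s0_new
        gains.append((l0, m))
    gains.reverse()                           # stored backwards, returned forwards
    return gains, s, s0
```

With all coefficients equal to one and a single stage, the sweep gives a coordinator gain of 0.2 and a follower gain of 0.4, which can be checked by hand from (4.8) and (4.9).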
4.3.2  Coordination With Nested Information Structure

Consider the case when the information of the states is incomplete
but the lower-level subsystems know the same measurement, i.e.
$z^1(k) = z^2(k) = z(k) = H(k)x(k) + \xi(k)$, and the coordinator knows both his
measurement and all the subsystems' measurements, $z^0(k) \supset z(k)$. Assume no
information transfer among subsystems through their controls. The
optimal strategies are derived using dynamic programming:

\[ u_o(k) = -\Delta(k)B^T(k)S(k+1)\left[A(k)\hat{x}(k) + B^0(k)u^0(k)\right] \tag{4.18} \]

\[ u^{0*}(k) = -\Delta^0(k)Y(k)\hat{x}^0(k) - \Delta^0(k)M(k)\left[\hat{x}(k) - \hat{x}^0(k)\right] \tag{4.19} \]

where $\hat{x}(k) = E[x(k)/z(k)]$, $\hat{x}^0(k) = E[x(k)/z^0(k)]$, $\Delta(k)$ and $\Delta^0(k)$ are
defined as in Section 4.3.1 and

\[ Y(k) = B^{0T}(I-BG)^TS_A(k+1)(I-BG)A \]

\[ M(k) = B^{0T}(I-BG)^T\left[S_B(k+1) - S_A(k+1)\right]A - Y(k) - B^{0T}(I-BG)^TS_B(k+1)K(k+1)HA \]

\[ G(k) = \Delta(k)B^TS(k+1) \]


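The optimal cost-to-go at each stage for the coordinator and the
lower-level subsystems are:

\[
J^{0*}(k) = \frac{1}{2}
\begin{bmatrix} \hat{x}^0(k) \\ \hat{x}(k)-\hat{x}^0(k) \end{bmatrix}^T
\begin{bmatrix} S_A(k) & S_B(k) \\ S_B^T(k) & S_C(k) \end{bmatrix}
\begin{bmatrix} \hat{x}^0(k) \\ \hat{x}(k)-\hat{x}^0(k) \end{bmatrix}
+ \frac{1}{2}\beta^0(k) \tag{4.20}
\]

\[ J^*(k) = \tfrac{1}{2}\hat{x}^T(k)S(k)\hat{x}(k) + \tfrac{1}{2}\beta(k) \tag{4.21} \]

where

\[ S_A(k) = Q^0 + A^T(I-BG)^TS_A(k+1)(I-BG)A - Y^T\Delta^0Y \tag{4.22} \]

\[
\begin{aligned}
S_B(k) = {} & A^T(I-BG)^TS_A(k+1)(I-BG)A + A^T(I-BG)^T\left[S_B(k+1) - S_A(k+1)\right]A \\
& - A^T(I-BG)^TS_B(k+1)K(k+1)HA - Y^T\Delta^0M
\end{aligned} \tag{4.23}
\]

\[
\begin{aligned}
S_C(k) = {} & -M^T\Delta^0M + A^TG^TB^TS_A(k+1)BGA \\
& + A^T\left[I-K(k+1)H\right]^TS_C(k+1)\left[I-K(k+1)H\right]A \\
& + A^T\left[S_B(k+1)K(k+1)H - S_B(k+1)\right]^TBGA \\
& - A^TG^TB^T\left[S_B(k+1)K(k+1)H - S_B(k+1)\right]A
\end{aligned} \tag{4.24}
\]

\[
\begin{aligned}
\beta^0(k) = {} & \beta^0(k+1) + \mathrm{tr}\,Q^0P^0(k) \\
& + \mathrm{tr}\{K^0(k+1)\left[H^0(k+1)P^0(k+1/k)H^{0T}(k+1) + \Xi^0(k+1)\right]K^{0T}(k+1)\left[S_A(k+1) + S_C(k+1) - 2S_B(k+1)\right]\} \\
& + 2\,\mathrm{tr}\{P^0(k+1/k)K(k+1)H(k+1)\left[S_B(k+1) - S_C(k+1)\right]\} \\
& + \mathrm{tr}\{K(k+1)\left[H(k+1)P^0(k+1/k)H^T(k+1) + \Xi(k+1)\right]K^T(k+1)S_C(k+1)\}
\end{aligned} \tag{4.25}
\]

\[ S(k) = Q + \left[A - B\Delta^0Y\right]^T\left[S(k+1) - G^TB^TS(k+1)\right]\left[A - B\Delta^0Y\right] + Y^T\Delta^0R\Delta^0Y \tag{4.26} \]

\[
\begin{aligned}
\beta(k) = {} & \beta(k+1) + \mathrm{tr}\,QP(k) + \mathrm{tr}\,S(k+1)K(k+1)\left[H(k+1)P(k+1/k)H^T(k+1) + \Xi(k+1)\right]K^T(k+1) \\
& + \mathrm{tr}\{\left[P(k/k) - P^0(k/k)\right]\left[M-Y\right]^T\Delta^0\left[R + B^TS(k+1)(I-BG)B\right]\Delta^0\left[M-Y\right]\}
\end{aligned} \tag{4.27}
\]

\[ K(k+1) = P(k+1/k)H^T(k+1)\left[H(k+1)P(k+1/k)H^T(k+1) + \Xi(k+1)\right]^{-1} \tag{4.28} \]

\[ P(k+1/k) = A(k+1)P(k/k)A^T(k+1) + \Lambda(k) \tag{4.29} \]

\[ P(k+1/k+1) = \left[I - K(k+1)H(k+1)\right]P(k+1/k) \tag{4.30} \]

\[ P(0/0) = \Sigma(0) \]

All these recursive equations can be solved with given initial or final
conditions. The existence condition of the solutions is that the
matrices to be inverted are nonsingular.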
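The filter recursions (4.28)-(4.30) run forward in time and do not depend on the control gains, so they can be computed before the backward sweep. The scalar sketch below illustrates one way to tabulate them; the function and argument names are hypothetical and time-invariant coefficients are assumed for brevity.

```python
def kalman_covariance(a, h, lam, xi, p0, n_steps):
    """Scalar forward recursion for the filter gain and error covariances.

    a   : state transition coefficient
    h   : measurement coefficient
    lam : process noise variance (Lambda)
    xi  : measurement noise variance (Xi)
    p0  : initial covariance P(0/0)
    Returns a list of (K(k+1), P(k+1/k), P(k+1/k+1)) tuples.
    """
    p = p0
    history = []
    for _ in range(n_steps):
        p_pred = a * p * a + lam                      # prediction, as in (4.29)
        gain = p_pred * h / (h * p_pred * h + xi)     # filter gain, as in (4.28)
        p = (1.0 - gain * h) * p_pred                 # update, as in (4.30)
        history.append((gain, p_pred, p))
    return history
```

With a = h = lam = xi = p0 = 1, the first step gives a predicted variance of 2 and a gain of 2/3, and the second step gives a gain of 5/8.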
4.4  Constrained Decentralized Structure


Section 3.4 describes why output feedback and dynamic output
feedback are more desirable in practical applications. In this section
we will derive the necessary conditions for Stackelberg coordination
when the lower-level subsystems choose to use the Pareto optimal solution
with constraints being placed on the controls. The cost functional of the
lower level is

\[ J(k) = \sum_{i=1}^{2}\alpha_iJ^i(k); \quad \alpha_i > 0; \quad \sum_{i=1}^{2}\alpha_i = 1 \]

\[ J(k) = \tfrac{1}{2}x^T(N)Q(N)x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\Big[x^T(k)Q(k)x(k) + \sum_{i=1}^{2}\alpha_iu^{iT}(k)R^i(k)u^i(k)\Big] \tag{4.31} \]

where

\[ Q = \sum_{i=1}^{2}\alpha_iQ^i \]
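The weighted-sum construction of the combined state weighting can be illustrated with a few lines of Python. This is only a sketch: the helper name and the nested-list matrix representation are assumptions, and the check on the weights simply mirrors the conditions stated with (4.31).

```python
def pareto_weighted_sum(alphas, matrices):
    """Return sum_i alpha_i * Q_i for equally sized nested-list matrices.

    The weights must be positive and sum to one, as required for the
    Pareto cost of the lower-level subsystems.
    """
    if any(a <= 0 for a in alphas) or abs(sum(alphas) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to one")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(a * m[i][j] for a, m in zip(alphas, matrices))
             for j in range(cols)]
            for i in range(rows)]
```

Different choices of the weights trace out different points of the Pareto optimal set, so the same routine can be reused while sweeping the weights.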


4.4.1  Decentralized  Control  with  Instantaneous  Output  Feedback 

When the controls are constrained to be a linear transformation of the
measurement at that instant and there is no information transfer through
the control, then

\[ u^i(k) = F^i(k)z^i(k), \quad i=0,1,2 \tag{4.32} \]

and

\[ z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{4.33} \]

where $F^i(k)$ is to be determined to minimize the expected value of
$J^i(u^i)$.


Consider the augmented system (4.1) and the measurement (4.33). Then

\[ u^i(k) = F^i(k)H^i(k)x(k) + F^i(k)\xi^i(k), \quad i=0,1,2 \]

and

\[ x(k+1) = \Big[A(k) + \sum_{i=0}^{2}B^i(k)F^i(k)H^i(k)\Big]x(k) + \sum_{i=0}^{2}B^i(k)F^i(k)\xi^i(k) + v(k) \tag{4.34} \]

Then the recursive equation for $P(k) = E\{x(k)x^T(k)\}$ is given by

\[
\begin{aligned}
P(k+1) = {} & \Big[A(k) + \sum_{i=0}^{2}B^i(k)F^i(k)H^i(k)\Big]P(k)\Big[A(k) + \sum_{i=0}^{2}B^i(k)F^i(k)H^i(k)\Big]^T \\
& + \sum_{i=0}^{2}B^i(k)F^i(k)\Xi^i(k)F^{iT}(k)B^{iT}(k) + \Lambda(k)
\end{aligned} \tag{4.35}
\]
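To make the second-moment recursion (4.35) concrete, the scalar sketch below propagates P(k) forward for fixed output-feedback gains. All names are hypothetical, and constant gains are assumed only to keep the example short.

```python
def propagate_second_moment(a, b, f, h, xi, lam, p0, n_steps):
    """Scalar version of the recursion for P(k) = E{x(k)^2}.

    b, f, h, xi are sequences indexed by player i = 0, 1, 2:
    input coefficient, feedback gain, measurement coefficient and
    measurement noise variance. lam is the process noise variance.
    Returns [P(0), P(1), ..., P(n_steps)].
    """
    players = range(len(b))
    a_cl = a + sum(b[i] * f[i] * h[i] for i in players)       # closed-loop coefficient
    noise = sum((b[i] * f[i]) ** 2 * xi[i] for i in players)  # measurement noise fed back
    p = [p0]
    for _ in range(n_steps):
        p.append(a_cl * p[-1] * a_cl + noise + lam)
    return p
```

For a = 0.5, unit b, h and noise variances, gains of -0.1 and P(0) = 1, the closed-loop coefficient is 0.2 and P(1) = 0.04 + 0.03 + 1 = 1.07.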


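Lemma 4.1  If the linear system described by (4.1) is controlled using a
linear control policy (4.32), i=1,2, then the expected cost (4.31)
can be expressed as

\[
\begin{aligned}
E[J(k)] = {} & \tfrac{1}{2}E[x^T(k)S(k)x(k)] + \tfrac{1}{2}\sum_{l=k+1}^{N}\mathrm{tr}\,S(l)\Lambda(l-1) \\
& + \tfrac{1}{2}\sum_{l=k+1}^{N}\Big\{\sum_{i=1}^{2}\mathrm{tr}\,F^{iT}(l-1)\left[\alpha_iR^i(l-1) + B^{iT}(l-1)S(l)B^i(l-1)\right]F^i(l-1)\Xi^i(l-1) \\
& \qquad + \mathrm{tr}\,F^{0T}(l-1)B^{0T}(l-1)S(l)B^0(l-1)F^0(l-1)\Xi^0(l-1)\Big\}
\end{aligned} \tag{4.36}
\]

where

\[
\begin{aligned}
S(k) = {} & Q(k) + \sum_{i=1}^{2}\alpha_iH^{iT}(k)F^{iT}(k)R^i(k)F^i(k)H^i(k) \\
& + \Big[A(k) + \sum_{i=0}^{2}B^i(k)F^i(k)H^i(k)\Big]^TS(k+1)\Big[A(k) + \sum_{i=0}^{2}B^i(k)F^i(k)H^i(k)\Big]
\end{aligned} \tag{4.37}
\]

\[ S(N) = Q(N) \]

The necessary condition for a minimum at each step is that the
derivative of the remaining cost with respect to $F^i(k)$, i=1,2, must
equal zero.

\[ F^{1*}(k) = -\left[\alpha_1R^1 + B^{1T}S(k+1)B^1\right]^{-1}B^{1T}S(k+1)\left[A + B^0F^0(k)H^0 + B^2F^2(k)H^2\right]P(k)H^{1T}\left[H^1P(k)H^{1T} + \Xi^1\right]^{-1} \tag{4.38} \]

\[ F^{2*}(k) = -\left[\alpha_2R^2 + B^{2T}S(k+1)B^2\right]^{-1}B^{2T}S(k+1)\left[A + B^0F^0(k)H^0 + B^1F^1(k)H^1\right]P(k)H^{2T}\left[H^2P(k)H^{2T} + \Xi^2\right]^{-1} \tag{4.39} \]

\[ F^{1*}(k) = \Gamma^1\left[A + B^0F^0(k)H^0\right]T^1 \tag{4.40} \]

\[ F^{2*}(k) = \Gamma^2\left[A + B^0F^0(k)H^0\right]T^2 \tag{4.41} \]

where

\[ \Gamma^i = \left[I - M^iB^jM^jB^i\right]^{-1}\left[M^i + \cdots\right], \quad i=1,2,\; j=1,2,\; i \ne j \]

\[ T^i = \left[Y^i + Y^jH^jY^i\right]\left[I - H^iY^jH^jY^i\right]^{-1}, \quad i=1,2,\; j=1,2,\; i \ne j \]

\[ M^i = -\left[\alpha_iR^i + B^{iT}S(k+1)B^i\right]^{-1}B^{iT}S(k+1), \quad i=1,2 \]

\[ Y^i = P(k)H^{iT}\left[H^iP(k)H^{iT} + \Xi^i\right]^{-1}, \quad i=1,2 \]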


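Lemma 4.2  If the linear system described by (4.1) is controlled using a
linear control policy (4.32), i=0, then the expected cost of the
coordinator (4.5) is expressed as

\[
\begin{aligned}
E[J^0(k)] = {} & \tfrac{1}{2}E[x^T(k)S^0(k)x(k)] + \tfrac{1}{2}\sum_{l=k+1}^{N}\Big\{\mathrm{tr}\,S^0(l)\Lambda(l-1) \\
& + \mathrm{tr}\,F^{0T}(l-1)\left[R^0(l-1) + B^{0T}(l-1)S^0(l)B^0(l-1)\right]F^0(l-1)\Xi^0(l-1) \\
& + \sum_{j=1}^{2}\mathrm{tr}\,F^{jT}(l-1)B^{jT}(l-1)S^0(l)B^j(l-1)F^j(l-1)\Xi^j(l-1)\Big\}
\end{aligned} \tag{4.42}
\]

where

\[
\begin{aligned}
S^0(k) = {} & Q^0(k) + H^{0T}(k)F^{0T}(k)R^0(k)F^0(k)H^0(k) \\
& + \Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2}B^i(k)F^i(k)H^i(k)\Big]^TS^0(k+1) \\
& \quad \times\Big[A(k) + B^0(k)F^0(k)H^0(k) + \sum_{i=1}^{2}B^i(k)F^i(k)H^i(k)\Big]
\end{aligned} \tag{4.43}
\]

\[ S^0(N) = Q^0(N) \]

At each step the necessary condition for a minimum is that the derivative
of the remaining cost with respect to each element of $F^0(k)$ must equal
zero.

\[
\begin{aligned}
F^0(k) = {} & -\left[R^0 + (B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)^TS^0(k+1)(B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)\right]^{-1} \\
& \times\Big\{(B^0 + B^1\Gamma^1B^0 + B^2\Gamma^2B^0)^TS^0(k+1)\left[A + B^1\Gamma^1AT^1H^1 + B^2\Gamma^2AT^2H^2\right]P\left[H^0 + H^0T^1H^1 + H^0T^2H^2\right]^T \\
& \quad + \left[B^1\Gamma^1B^0 + B^2\Gamma^2B^0\right]^TS^0(k+1)\left[B^1\Gamma^1AT^1\Xi^1T^{1T}H^{0T} + B^2\Gamma^2AT^2\Xi^2T^{2T}H^{0T}\right]\Big\} \\
& \times\left[H^0PH^{0T} + (H^0T^1H^1 + H^0T^2H^2)P(H^0 + H^0T^1H^1 + H^0T^2H^2)^T + \Xi^0 + H^0T^1\Xi^1T^{1T}H^{0T} + H^0T^2\Xi^2T^{2T}H^{0T}\right]^{-1}
\end{aligned} \tag{4.44}
\]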


Theorem 4.1  The sequences $\{F^i(k)\}$, i=0,1,2; k=0,1,...,N-1, of the i-th
subsystem that minimize $E\{J^i(k)\}$, i=0,1,2, subject to the constraint
(4.32) are given by the equations (4.44), (4.40) and (4.41), where it is
assumed that the required inverses exist and

1.
\[ P(k+1) = \Big[A + \sum_{i=0}^{2}B^iF^i(k)H^i\Big]P(k)\Big[A + \sum_{i=0}^{2}B^iF^i(k)H^i\Big]^T + \sum_{i=0}^{2}B^iF^i(k)\Xi^iF^{iT}(k)B^{iT} + \Lambda(k) \tag{4.45} \]

P(0) is given.

2.
\[ S(k) = Q + \sum_{i=1}^{2}\alpha_iH^{iT}F^{iT}(k)R^iF^i(k)H^i + \Big[A + \sum_{i=0}^{2}B^iF^i(k)H^i\Big]^TS(k+1)\Big[A + \sum_{i=0}^{2}B^iF^i(k)H^i\Big] \tag{4.46} \]

\[ S(N) = Q(N) \]

3.
\[ S^0(k) = Q^0 + H^{0T}F^{0T}(k)R^0F^0(k)H^0 + \Big[A + B^0F^0(k)H^0 + \sum_{i=1}^{2}B^iF^{i*}(k)H^i\Big]^TS^0(k+1)\Big[A + B^0F^0(k)H^0 + \sum_{i=1}^{2}B^iF^{i*}(k)H^i\Big] \tag{4.47} \]

\[ S^0(N) = Q^0(N) \]


To compute the cost incurred when the players use arbitrary linear
output feedback control of the form

\[ u^i(k) = K^i(k)z^i(k) \tag{4.48} \]

the cost-to-go at stage k is

\[
\begin{aligned}
J^i(k) = {} & \tfrac{1}{2}E[x^T(k)S^i(k)x(k)] + \tfrac{1}{2}\sum_{l=k+1}^{N}\Big\{\mathrm{tr}\,S^i(l)\Lambda(l-1) \\
& + \mathrm{tr}\,K^{iT}(l-1)\left[R^i(l-1) + B^{iT}(l-1)S^i(l)B^i(l-1)\right]K^i(l-1)\Xi^i(l-1) \\
& + \sum_{j=0,\, j \ne i}^{2}\mathrm{tr}\,K^{jT}(l-1)B^{jT}(l-1)S^i(l)B^j(l-1)K^j(l-1)\Xi^j(l-1)\Big\}
\end{aligned} \tag{4.49}
\]

where

\[ S^i(k) = Q^i + H^{iT}K^{iT}(k)R^iK^i(k)H^i + \Big[A + \sum_{j=0}^{2}B^jK^j(k)H^j\Big]^TS^i(k+1)\Big[A + \sum_{j=0}^{2}B^jK^j(k)H^j\Big] \tag{4.50} \]

\[ S^i(N) = Q^i(N) \]

The sequences $\{F^i(k)\}$, i=0,1,2; k=0,1,...,N-1, of the coordinator and
the i-th subsystem are the solution to the discrete two-point boundary
value problem. The simple procedure to solve the equations suggested in
Section 3.4.1 is also recommended here.
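The two-point structure arises because P(k) in (4.45) runs forward from P(0) while S(k) in (4.46) runs backward from S(N), and both depend on the gains. One plausible way to organize a solution is a fixed-point sweep that alternates a forward pass and a backward pass until the gains stop changing. The scalar sketch below is an assumption-laden illustration, not the procedure of Section 3.4.1: it holds the coordinator gain sequence fixed instead of updating it from (4.44), updates the follower gains from (4.38)-(4.39), and uses hypothetical names throughout. Convergence is not guaranteed in general.

```python
def iterate_follower_gains(a, b, h, xi, lam, q, q_final, alpha, r, f0, p0,
                           max_iter=100, tol=1e-10):
    """Alternate forward (P) and backward (S) passes for scalar follower gains.

    b, h, xi : 3-element sequences indexed by player (0 = coordinator).
    alpha, r : 2-element sequences for followers 1 and 2.
    f0       : coordinator gain sequence, held fixed (an assumption).
    Returns (f1, f2, converged).
    """
    n_steps = len(f0)
    f1 = [0.0] * n_steps
    f2 = [0.0] * n_steps
    for _ in range(max_iter):
        # Forward pass: second moment P(k) under the current gains.
        p = [p0]
        for k in range(n_steps):
            g = (f0[k], f1[k], f2[k])
            a_cl = a + sum(b[i] * g[i] * h[i] for i in range(3))
            noise = sum((b[i] * g[i]) ** 2 * xi[i] for i in range(3))
            p.append(a_cl * p[k] * a_cl + noise + lam)
        # Backward pass: update gains and S(k), starting from S(N).
        s_next = q_final
        new_f1, new_f2 = list(f1), list(f2)
        for k in range(n_steps - 1, -1, -1):
            a0 = a + b[0] * f0[k] * h[0]
            y1 = p[k] * h[1] / (h[1] * p[k] * h[1] + xi[1])
            y2 = p[k] * h[2] / (h[2] * p[k] * h[2] + xi[2])
            m1 = -b[1] * s_next / (alpha[0] * r[0] + b[1] * s_next * b[1])
            m2 = -b[2] * s_next / (alpha[1] * r[1] + b[2] * s_next * b[2])
            new_f1[k] = m1 * (a0 + b[2] * new_f2[k] * h[2]) * y1
            new_f2[k] = m2 * (a0 + b[1] * new_f1[k] * h[1]) * y2
            a_cl = a0 + b[1] * new_f1[k] * h[1] + b[2] * new_f2[k] * h[2]
            s_next = (q
                      + alpha[0] * r[0] * (new_f1[k] * h[1]) ** 2
                      + alpha[1] * r[1] * (new_f2[k] * h[2]) ** 2
                      + a_cl * s_next * a_cl)
        change = max(abs(x - y) for x, y in zip(f1 + f2, new_f1 + new_f2))
        f1, f2 = new_f1, new_f2
        if change < tol:
            return f1, f2, True
    return f1, f2, False
```

The returned flag makes a failure to converge visible to the caller, which matters because different starting gains may lead to different limits.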

4.4.2  Decentralized  control  with  dynamic  output  feedback 

When the controls are constrained to be a linear dynamic output
feedback, with a dynamic controller of a specified order for the i-th
subsystem and the coordinator described by

\[ w^i(k+1) = D^i(k)w^i(k) + M^i(k)z^i(k), \quad i=0,1,2 \tag{4.51} \]

where $w^i \in R^{s_i}$ is the state vector of the controllers used, then

\[ u^i(k) = N^i(k)w^i(k) + F^i(k)z^i(k), \quad i=0,1,2 \tag{4.52} \]

also

\[ z^i(k) = H^i(k)x(k) + \xi^i(k), \quad i=0,1,2 \tag{4.53} \]

For a given integer $s_i$ ($0 \le s_i \le n$) find matrices $N^i(k)$, $F^i(k)$, $D^i(k)$ and
$M^i(k)$ such that the corresponding expected cost $E\{J^i(k)\}$ will be
minimum. The cost functional to be considered is the same as in Section
4.4.1.

Consider the augmented state vector

\[ \tilde{x}^T(k) = [\, x^T(k) \;\; w^{0T}(k) \;\; w^{1T}(k) \;\; w^{2T}(k) \,] \]

then

\[ \tilde{x}(k+1) = \Big(\tilde{A} + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i\tilde{H}^i\Big)\tilde{x}(k) + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i\tilde{I}^i\xi^i(k) + \tilde{I}v(k) \tag{4.54} \]

where

\[ \tilde{F}^i(k) = \begin{bmatrix} F^i(k) & N^i(k) \\ M^i(k) & D^i(k) \end{bmatrix} \]

and $\tilde{A}$, $\tilde{B}^i$, $\tilde{H}^i$, $\tilde{I}^i$ and $\tilde{I}$ are the correspondingly augmented matrices, and

\[ u^i(k) = T\tilde{F}^i(k)\tilde{H}^i\tilde{x}(k) + T\tilde{F}^i(k)\tilde{I}^i\xi^i(k), \quad i=0,1,2 \tag{4.55} \]

where $T = [\, I \;\; 0 \,]$.

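Let $P(k) = E[\tilde{x}(k)\tilde{x}^T(k)]$. The cost functional of the coordinator is

\[ J^0 = \tfrac{1}{2}\tilde{x}^T(N)\tilde{Q}^0(N)\tilde{x}(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[\tilde{x}^T(k)\tilde{Q}^0(k)\tilde{x}(k) + u^{0T}(k)R^0(k)u^0(k)\right] \tag{4.56} \]

Also, the cost functional of the lower-level subsystems is

\[ J = \sum_{i=1}^{2}\alpha_iJ^i; \quad \alpha_i > 0; \quad \sum_{i=1}^{2}\alpha_i = 1 \]

\[ J = \tfrac{1}{2}\tilde{x}^T(N)\tilde{Q}(N)\tilde{x}(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\Big[\tilde{x}^T(k)\tilde{Q}(k)\tilde{x}(k) + \sum_{i=1}^{2}\alpha_iu^{iT}(k)R^i(k)u^i(k)\Big] \tag{4.57} \]

The augmented system (4.54) and controller (4.55) are of the same form
as (4.34) and (4.32). Also the cost functionals are the same. The
following theorem can be derived using the same argument as Theorem 4.1.

Theorem 4.2  The sequences $\{\tilde{F}^i(k)\}$, i=0,1,2; k=0,1,...,N-1, of the
coordinator and the i-th subsystem that minimize $E\{J^i(k)\}$, i=0,1,2,
subject to the constraint (4.55) are given by

\[
\begin{aligned}
\tilde{F}^0(k) = {} & -\left[\tilde{R}^0 + (\tilde{B}^0 + \tilde{B}^1\Gamma^1\tilde{B}^0 + \tilde{B}^2\Gamma^2\tilde{B}^0)^TS^0(k+1)(\tilde{B}^0 + \tilde{B}^1\Gamma^1\tilde{B}^0 + \tilde{B}^2\Gamma^2\tilde{B}^0)\right]^{-1} \\
& \times\Big\{(\tilde{B}^0 + \tilde{B}^1\Gamma^1\tilde{B}^0 + \tilde{B}^2\Gamma^2\tilde{B}^0)^TS^0(k+1)\left[\tilde{A} + \tilde{B}^1\Gamma^1\tilde{A}T^1\tilde{H}^1 + \tilde{B}^2\Gamma^2\tilde{A}T^2\tilde{H}^2\right]P\left[\tilde{H}^0 + \tilde{H}^0T^1\tilde{H}^1 + \tilde{H}^0T^2\tilde{H}^2\right]^T \\
& \quad + \left[\tilde{B}^1\Gamma^1\tilde{B}^0 + \tilde{B}^2\Gamma^2\tilde{B}^0\right]^TS^0(k+1)\left[\tilde{B}^1\Gamma^1\tilde{A}T^1\Xi^1T^{1T}\tilde{H}^{0T} + \tilde{B}^2\Gamma^2\tilde{A}T^2\Xi^2T^{2T}\tilde{H}^{0T}\right]\Big\} \\
& \times\left[\tilde{H}^0P\tilde{H}^{0T} + (\tilde{H}^0 + \tilde{H}^0T^1\tilde{H}^1 + \tilde{H}^0T^2\tilde{H}^2)P(\tilde{H}^0 + \tilde{H}^0T^1\tilde{H}^1 + \tilde{H}^0T^2\tilde{H}^2)^T + \Xi^0 + \tilde{H}^0T^1\Xi^1T^{1T}\tilde{H}^{0T} + \tilde{H}^0T^2\Xi^2T^{2T}\tilde{H}^{0T}\right]^{-1}
\end{aligned} \tag{4.58}
\]

\[ \tilde{F}^i(k) = \Gamma^i(k)\left[\tilde{A}(k) + \tilde{B}^0(k)\tilde{F}^0(k)\tilde{H}^0(k)\right]T^i(k), \quad i=1,2 \tag{4.59} \]

where

\[ \Gamma^i = \left[I - M^i\tilde{B}^jM^j\tilde{B}^i\right]^{-1}\left[M^i + \cdots\right], \quad i=1,2,\; j=1,2,\; i \ne j \]

\[ T^i = \left[Y^i + Y^j\tilde{H}^jY^i\right]\left[I - \tilde{H}^iY^j\tilde{H}^jY^i\right]^{-1}, \quad i=1,2,\; j=1,2,\; i \ne j \]

\[ M^i = -\left[\alpha_i\tilde{R}^i + \tilde{B}^{iT}S(k+1)\tilde{B}^i\right]^{-1}\tilde{B}^{iT}S(k+1), \quad i=1,2 \]

\[ Y^i = P(k)\tilde{H}^{iT}\left[\tilde{H}^iP(k)\tilde{H}^{iT} + \Xi^i\right]^{-1}, \quad i=1,2 \]

It is assumed that the required inverse matrices exist, where

1.
\[ P(k+1) = \Big[\tilde{A} + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i(k)\tilde{H}^i\Big]P(k)\Big[\tilde{A} + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i(k)\tilde{H}^i\Big]^T + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i(k)\Xi^i\tilde{F}^{iT}(k)\tilde{B}^{iT} + \Lambda(k) \tag{4.60} \]

P(0) is given.

2.
\[ S(k) = \tilde{Q} + \sum_{i=1}^{2}\alpha_i\tilde{H}^{iT}\tilde{F}^{iT}(k)\tilde{R}^i\tilde{F}^i(k)\tilde{H}^i + \Big[\tilde{A} + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i(k)\tilde{H}^i\Big]^TS(k+1)\Big[\tilde{A} + \sum_{i=0}^{2}\tilde{B}^i\tilde{F}^i(k)\tilde{H}^i\Big] \tag{4.61} \]

\[ S(N) = \tilde{Q}(N) \]

3.
\[ S^0(k) = \tilde{Q}^0 + \tilde{H}^{0T}\tilde{F}^{0T}(k)\tilde{R}^0\tilde{F}^0(k)\tilde{H}^0 + \Big[\tilde{A} + \tilde{B}^0\tilde{F}^0(k)\tilde{H}^0 + \sum_{i=1}^{2}\tilde{B}^i\tilde{F}^{i*}(k)\tilde{H}^i\Big]^TS^0(k+1)\Big[\tilde{A} + \tilde{B}^0\tilde{F}^0(k)\tilde{H}^0 + \sum_{i=1}^{2}\tilde{B}^i\tilde{F}^{i*}(k)\tilde{H}^i\Big] \tag{4.62} \]

\[ S^0(N) = \tilde{Q}^0(N) \]

Again the sequences $\{\tilde{F}^i(k)\}$, i=0,1,2; k=0,1,...,N-1, of the coordinator
and the i-th subsystem are the solution to the discrete two-point
boundary value problem. The procedure used in Section 4.4.1 can be used
to solve for the solutions.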


4.5  Conclusions

As in all nonzero-sum differential games, there are a variety of
"optimal solutions", since the lower-level subsystems may or may not
cooperate within their group. When the lower-level subsystems, which are
all followers with respect to the coordinator or leader, desire to
cooperate within their group, the Pareto optimal solutions are obtained.
Both centralized and decentralized control structures were considered.
The main idea is the same as in Chapter 3.

5.  DECENTRALIZED STOCHASTIC STACKELBERG COORDINATION
IN AUTOMATIC GENERATION CONTROL OF INTERCONNECTED POWER SYSTEM

5.1  Introduction

An interconnected electric energy system can be described as a
collection of subsystems, each of which is called a control area. Each
area is responsible for meeting its obligation to maintain the
appropriate system frequency and supply its own load demand. Also, each
area provides mutual assistance to its neighbors in accordance with the
basic operating policy of interconnected power systems [23]. Two of the
most important aspects of system control involve the regulation of
system frequency and net power interchange. When the interconnected
network is small, centralised techniques can be used quite effectively
[19,20,33]. However, in the more general case the
communication/computational costs involved in implementing a centralised
controller often become essential. Furthermore, the trend in the
utility industry is strongly toward digital control, using the digital
computer for calculating generation changes etc. A discrete
formulation of this problem would thus seem of more practical interest.

Interest in the dynamical aspects of load frequency control has
stimulated the application of modern control techniques to this problem,
particularly the theory of the optimal linear regulator [25]. Calovic
[19,20] was the first to clearly distinguish the steady state problem
from the transient problem. The procedure used is to adjoin the
integral of each area control error ($ACE_i = b_i\Delta f_i + \Delta P_{tie,i}$, where $b_i$ is
the tie line bias constant specified as the area frequency
characteristic) to the system equations (Fosha and Elgerd [25] adjoin
integrals of frequency and tie line flow errors). These new state
variables as well as the original system state variables are included in
the cost functional. As a result, all areas capable of doing so will
drive their area control errors to zero in steady state provided the
system is stable. It is not clear from the control equations what
control actions would be taken in each area if any area is not able to
control optimally.


Decentralized Stackelberg strategies will be used to develop a
decentralized controller for a three area electric power system. This
new design procedure is based on a stochastic Stackelberg strategy
extended by introducing optimal regulation with individual choice of
cost functional for each control area. The problem now becomes a
multicriteria problem with multi-decision makers. This is where
differential games theory is relevant to define "optimality". Once the
optimality is defined, we can calculate $K_i$ ($K_i$ is the controller gain
used by the area to accomplish the required control action on the error
ACE), which varies between areas because of differences in dynamics and
disturbance. The control laws are linear functions of measurable output
for each control area and do not require measurement of disturbance.
This new decentralized Stackelberg coordination is investigated for a
three-area interconnected power system. Optimal solutions, suboptimal
simplifications and simulation results are presented.

5.2  Power System Dynamic Model

A power system dynamic model was developed in [20,25], where it is
assumed that area buses are stiffly interconnected, and that the
deviations in frequency and scheduled power interchange are caused
solely by the load disturbances. If each area is modeled as an
equivalent electric generating system wherein a nonreheat steam turbine
is employed, then the following equations represent the interconnected
power system linearized about a given nominal operating point:

\[ \frac{d}{dt}(\Delta f_i) = -\frac{D_if^*}{2H_i}\,\Delta f_i + \frac{f^*}{2H_i}\left(\Delta P_{ti} - \Delta P_{tie,i} - \Delta P_{Li}\right) \tag{5.1} \]

\[ \frac{d}{dt}(\Delta P_{ti}) = \frac{1}{T_{ti}}\left(\Delta S_{gi} - \Delta P_{ti}\right) \tag{5.2} \]

\[ \frac{d}{dt}(\Delta S_{gi}) = \frac{1}{T_{gi}}\left(\Delta P_{ci} - \Delta S_{gi} - \frac{1}{R_i}\Delta f_i\right) \tag{5.3} \]

\[ \frac{d}{dt}(\Delta P_{tie,i}) = \sum_{j}T_{ij}\left(\Delta f_i - \Delta f_j\right) \tag{5.4} \]

\[ ACE_i = b_i\Delta f_i + \Delta P_{tie,i} \tag{5.5} \]


The symbols are defined as follows:

    f*            nominal system frequency
    H_i           inertia constant
    D_i           system damping
    T_ti          turbine time constant
    T_gi          governor time constant
    T_ij          transmission constant
    R_i           speed droop
    b_i           frequency bias constant
    Δf_i          frequency deviation
    ΔP_ti         turbine output deviation
    ΔS_gi         governor position deviation
    ΔP_tie,i      net power flow deviation
    ΔP_ci         control signal-command to speed changer
    ΔP_Li         load disturbance
    ξ_i           plant noise

The values of the parameters are as follows:

    P_r0 = P_r1 = P_r2 = 2000 MW
    H_0  = H_1  = H_2  = 5 seconds
    D_0  = D_1  = D_2  = 8.33x10^-3 pu MW/Hz
    T_t0 = T_t1 = T_t2 = 0.3 sec.
    T_g0 = T_g1 = T_g2 = 0.08 sec.
    R_0  = R_1  = R_2  = 2.4 Hz/pu MW
    P_tie,max          = 200 MW
    δ_i - δ_j          = 30 degrees
    T_ij               = 0.545 pu MW
    b_i                = 0.425

For a more complete definition of the model and terms see [20,25].


An appropriate formalization of this problem involves defining the
following linear quadratic regulator problem:

state equation
\[ \dot{x}(t) = Ax(t) + Bu(t) + Dv(t) + \xi(t) \tag{5.6} \]

output equation
\[ y(t) = Hx(t) + \eta(t) \tag{5.7} \]

cost function
\[ J = \int_0^{\infty}\left(x^TQx + u^TRu\right)dt \tag{5.8} \]

where, for each control area, the state is
$x_i = (\Delta f_i, \Delta P_{ti}, \Delta S_{gi}, \Delta P_{tie,i}, IACE_i)$ ($IACE_i = \int ACE_i\,dt$; these new
state variables are included for the purpose of reducing the steady
state errors [20]), the control vector is $u_i = \Delta P_{ci}$, and the disturbance
vector is $v_i = \Delta P_{Li}$, with i=1,...,n. The plant and measurement noise vectors
$\xi(t)$ and $\eta(t)$, respectively, are modeled as zero mean mutually
independent stationary white Gaussian processes. The matrices R and Q
in the cost functional are selected in such a way that emphasizes the
$ACE_i$. For simplicity we choose R = I. Here it is assumed that each area
has only one plant.

5.3  Stackelberg Coordination

In accordance with the basic operating policy, the desired goal is
to regulate each area control error, $ACE_i$, to zero without using
excessive control effort. Each control area problem can be formulated
as a linear regulator problem with a cost functional of its own.
Decision making by any area to obtain optimum control performance for
its area will affect other areas. With multicriteria and multidecision
making we have to define "optimality". In differential games theory
"optimality" is defined in terms of the rationality assumed by the
decision makers in computing their controls. Each area can choose a
strategy depending on the dynamics of its system, its information and
its computational capability. Since we have more than two areas, it
seems appropriate to apply Stackelberg coordination for decentralized
control to this problem. Designate an area to be a coordinator who
coordinates the other areas, which are viewed as followers. The
coordinator chooses a leader Stackelberg strategy to play with the lower
level subsystems. The lower level subsystems may or may not cooperate
among themselves, so they can choose either Nash rationale or Pareto
rationale to play between them.

The controllers are constrained to be of the form

\[ u^i(t) = F^i(t)y^i(t), \quad i=0,1,2 \tag{5.9} \]

where $y^i(t)$ is the measurable output of each area and $F^i(t)$ is chosen so
as to minimize the cost functions. The resulting necessary conditions
for optimality of $F^i$, for the discrete system, are derived in Section 3.4
and Section 4.4. A simple approximate computational algorithm is also
suggested, but there is no guarantee that the algorithm will converge.


5.4  Design and Simulation Study


A three-area power system with numerical constants as given in [24]
was chosen as the basis for this study. In discretization of the
system, LINSYS [11] was used. Since we are only interested in
load-frequency control, we can consider the turbine controller fast
relative to the rest of the system. By the assumption above the time
constant of the system is approximately 1 sec. [24], so we chose a
discretization interval of 0.2 sec. After discretization LINSYS was
used to determine the eigenvalues, controllability, and observability of
the discrete-time system. The discrete-time system with discretization
interval 0.2 sec. is stable and controllable.
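For readers without access to LINSYS, the discretization step can be illustrated with a short sketch. The code below is not LINSYS and makes no claim about how that program works; it assumes a zero-order hold on the input over each interval and evaluates the matrix exponential by a truncated power series. The function names and the nested-list matrix representation are hypothetical.

```python
def mat_mul(x, y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def zoh_discretize(a, b, dt, terms=20):
    """Discretize dx/dt = A x + B u assuming u is constant over each interval.

    Returns (phi, gamma) with x(k+1) = phi x(k) + gamma u(k), where
    phi   = sum_m (A dt)^m / m!
    gamma = (sum_m A^m dt^(m+1) / (m+1)!) B
    Both series are truncated after `terms` terms.
    """
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    phi = [row[:] for row in identity]
    integral = [[dt * v for v in row] for row in identity]
    term = [row[:] for row in identity]          # holds (A dt)^m / m!
    for m in range(1, terms):
        term = [[v * dt / m for v in row] for row in mat_mul(term, a)]
        for i in range(n):
            for j in range(n):
                phi[i][j] += term[i][j]
                integral[i][j] += term[i][j] * dt / (m + 1)
    return phi, mat_mul(integral, b)
```

With a sampling interval of 0.2 sec. and time constants near 1 sec., the product of the system matrix and the interval is small, so a short series is normally adequate; stiff models would call for a more careful method.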
Consider a discrete version of a three area interconnected power
system:

state equation
\[ x(k+1) = Ax(k) + B^0u^0(k) + B^1u^1(k) + B^2u^2(k) + Ew(k) + v(k) \tag{5.10} \]

measurement equation
\[ y^i(k) = H^ix(k) + \eta^i(k), \quad i=0,1,2 \tag{5.11} \]

cost function
\[ J^i = x^T(n)Q^i(n)x(n) + \sum_{k=0}^{n-1}\left[x^T(k)Q^ix(k) + u^{iT}(k)R^iu^i(k)\right], \quad i=0,1,2 \tag{5.12} \]

where for each area the state vector is

\[ x^i(k) = (x^i_1, \ldots, x^i_4) = (\Delta f_i, \Delta P_{ti}, \Delta S_{gi}, \Delta P_{tie,i}) \]

The control vector is $u^i = \Delta P_{ci}$ and the disturbance vector is $w^i = \Delta P_{Li}$,
where i=0,1,2. The plant and measurement noise vectors $v(k)$ and $\eta^i(k)$
are zero mean mutually independent stationary white Gaussian processes
with 0.001 per unit standard deviation. The matrices appearing in the
cost function are defined as in the continuous case. The measurable
output vector is formed as a linear combination of states required to
have zero steady-state values ($\Delta f_i$, $\Delta P_{tie,i}$). The numerical values of the
elements of the matrices appearing in (5.10) are given in Appendix 3. The
object is to design a linear feedback control $u^i(k)$, i=0,1,2, to
compensate the effect of a constant or slowly varying disturbance $w^i(k)$
using only the output $y^i(k)$. For any constant or slowly varying
disturbance $w(k)$, using the Smith/Davidson [55] approach, consider the
augmented system:

\[ \tilde{x}(k+1) = \tilde{A}\tilde{x}(k) + \tilde{B}^1\tilde{u}^1(k) + \tilde{B}^2\tilde{u}^2(k) + \tilde{B}^0\tilde{u}^0(k) + \tilde{v}(k) \tag{5.13} \]

\[ \tilde{y}^i(k) = \tilde{H}^i\tilde{x}(k) + \tilde{\eta}^i(k), \quad i=0,1,2 \tag{5.14} \]

where

\[ \tilde{x}(k) = \begin{bmatrix} x(k+1)-x(k) \\ y(k) \end{bmatrix}, \quad \tilde{u}^i(k) = \left[u^i(k+1) - u^i(k)\right], \quad \tilde{y}^i(k) = \begin{bmatrix} y^i(k+1)-y^i(k) \\ y^i(k) \end{bmatrix} \]

The linear control law $\tilde{u}^i(k)$ is

\[ \tilde{u}^i(k) = F^i(k)\tilde{y}^i(k), \quad i=0,1,2 \tag{5.15} \]

where $F^i(k)$ is determined using the decentralized stochastic Stackelberg
method.
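Because the control law acts on increments, summing it over time yields a term proportional to the output plus a term proportional to the running sum of the output, that is, proportional plus integral action. The scalar sketch below illustrates this under the assumption that the gain is partitioned into one coefficient on the output difference and one on the output itself; the names and the constant gains are hypothetical.

```python
def incremental_control(f_diff, f_level, y, u0=0.0):
    """Accumulate u(k+1) = u(k) + f_diff*(y(k+1) - y(k)) + f_level*y(k).

    f_diff  : gain on the output difference (gives proportional action)
    f_level : gain on the output itself (gives integral action)
    y       : sequence of measured outputs y(0), y(1), ...
    Returns the control sequence u(0), u(1), ..., one entry per output sample.
    """
    u = [u0]
    for k in range(len(y) - 1):
        increment = f_diff * (y[k + 1] - y[k]) + f_level * y[k]
        u.append(u[k] + increment)
    return u
```

Unrolling the recursion gives u(K) = u(0) + f_diff (y(K) - y(0)) + f_level times the sum of y(0) through y(K-1), which is the proportional plus integral form.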


Area 0 is chosen to be the coordinator or leader. Then areas 1 and 2
are followers with respect to area 0. When the lower level subsystems
choose to play Nash rationale the resulting controllers are as defined
in Section 3.4.1. When the lower level subsystems choose to play Pareto
rationale the resulting controllers are as defined in Section 4.4.1.
The matrices $R^i$ and $Q^i$ appearing in the cost functional (5.12) are
selected in such a way that the cost function for each area is

\[ J^i = \sum_{k}\left[\Delta f_i^2(k+1) + \Delta P_{tie,i}^2(k+1) + ACE_i^2(k+1) + \left(\tilde{u}^i(k)\right)^2\right], \quad i=0,1,2 \]

5.5  Oi scussion  on  A laorlthm  and  Results 

So  far  no  convergence  conditions  for  this  algorithm  have  beer, 

found,  but  as  with  most  algorithms  of  this  type  it  is  felt  that 

convergence  depends  on  the  initial  guess.  A test  for  satisfactory 

convergence  in  cost  is  inserted  when  the  computational  procedure  is 

implemented.  The  iterative  procedure  converged  m cost.  From  the  test 
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results,  one  might  hope  that  it  would  always  converge  to  the  'optimal 
solution'.  Unfortunately  for  certain  systems  the  limiting  values 
produced  depended  on  the  initial  guess.  In  these  cases,  the  algorithm 
converged  to  a .solution  to  a two  point  boundary  value  problem  one  of 
whose  solutions  is  the  optimal.  It  is  the  nature  of  specific  optimal 
problems  to  have  local  as  well  as  global  minima.  Since  uniqueness  has 
not  been  proved,  all  solutions  to  the  boundary  value  problem  must  be 
found  to  determine  the  global  minimum.  This  difficulty  with  uniqueness 
could  be  anticipated  since  the  necessary  conditions  are  local.  One  must 
therefore  find  a good  starting  point  if  the  procedure  is  to  converge  to 
the  optimum. 

The computational algorithm suggested in this work for the solution of
this problem cannot guarantee satisfactory results. For this particular
example the algorithm has exhibited rapid convergence, so no more
exotic techniques have been tried. The method developed in this work is
suitable for solving finite-time problems. Unnecessary complexity is
particularly burdensome in these problems, as the time records of all
the controller gains must be stored. The algorithm proposed can provide
solutions for many problems at a reasonable cost, but it should be noted
that the computer time will increase as the state dimension of the
system, the number of gains and the number of time intervals increase.

Fig. 5.1 shows the curves of frequency and tie-line variations for a
free system response upon a t* step-load change in area 0. Fig. 5.2
shows the system response under output feedback Stackelberg coordination
with the lower level using the Nash rationale within their group. The
disturbance is the same as in Fig. 5.1. Fig. 5.3 shows the system
response under output feedback Stackelberg coordination with the lower
level using the Pareto optimal rationale within their group; $\alpha$ is
chosen to be 0.5. The disturbance is the same as in Fig. 5.1.


From the results of the computer simulation study, it is concluded
that in this particular example decentralised stochastic Stackelberg
coordination retains favourable transient features. However, the
disturbed area still has a small steady-state error in frequency
deviation (0.006 Hz). The ratio of the coefficients of the weighting
matrices $Q^i$ and $R^i$ plays an important role in the system response.
It should be noted that an improper choice of $R^i$ and $Q^i$ can make
the system unstable, or the algorithm may not give the desired system
response. A good choice of $R^i$ and $Q^i$ depends on the system, and
suitable values can be selected by trial and error. The implementation
of these control sequences in practice is complex, however, since the
controls vary with time. Therefore we suggest a suboptimal
simplification of the control. The suboptimal gains are selected from
the constant part of each control sequence and are used throughout the
entire period. Fig. 5.4 shows the plots of the optimal gains of areas 0,
1 and 2. The constant gains of each area are chosen to be (-.5, -.39),
(-.09, -.6) and (-.09, -.6) respectively. Fig. 5.5 shows the system
responses under the suboptimal simplification. The responses do not
differ significantly from the responses under the optimal solution.
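
The text does not say how the constant part of a gain sequence is
identified. One possible reading, given here only as a sketch, is to
take the longest run of stages over which the gain hardly changes and
use its average. The function `constant_part`, its tolerance and the
sample sequence are all assumptions for illustration.

```python
def constant_part(gains, tol=1e-3):
    """Return the mean of the longest run of nearly constant gains.

    `gains` is the time record of one scalar feedback gain, k = 0..N-1.
    Two successive values belong to the same run if they differ by at most `tol`.
    """
    if not gains:
        raise ValueError("empty gain sequence")
    best_start, best_len = 0, 1
    start = 0
    for k in range(1, len(gains)):
        if abs(gains[k] - gains[k - 1]) > tol:
            start = k  # the run is broken; a new one starts here
        length = k - start + 1
        if length > best_len:
            best_start, best_len = start, length
    run = gains[best_start:best_start + best_len]
    return sum(run) / len(run)


# Made-up sequence: flat for most of the horizon, transient near the final time.
sample = [-0.5] * 20 + [-0.45, -0.3, 0.0]
constant_gain = constant_part(sample)
```

For finite-horizon problems the gains typically vary only near the
final time, so the longest flat run covers most of the interval.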

5.6  Conclusions

In this chapter, an attempt to develop a new decentralized linear
regulator approach for load-frequency control in a three-area
interconnected power system has been discussed. The method is based on
decentralized stochastic Stackelberg coordination. Each control area
uses a feedback control based only on measurements from its own area.
Also, each area is free to select an appropriate cost function. The
extended theory is applied to a discrete model of a three-area
interconnected power system. A numerical design method utilizing a
proportional-plus-integral control structure is suggested. For the
example studied, this method gives satisfactory results. The speed of
the dynamic response can be adjusted through the elements of the
weighting matrices $Q^i$ and $R^i$. Unfortunately, the stability and
convergence of the procedure have not yet been established. Since
constant control laws are preferable in practice, we also suggest a
suboptimal simplification of the controls, which performs quite well in
our particular example.

6.  CONCLUSIONS


In the first part of this thesis, we have reviewed the equilibrium
solutions of a two-person LQNZSDG in which we have modelled the effect
of random disturbances by including an additive zero-mean white noise in
the state dynamics, whose statistics are not necessarily known to the
players. Both cooperative and noncooperative solution concepts, i.e.
Pareto optimal, Nash equilibrium and Stackelberg equilibrium, are
examined. Results available in the literature indicate that the
solutions for this class of game, and for the different strategies, are
affine for each player.

In the second part of this thesis, an interconnected set of linear
discrete-time stochastic systems, where N decision-makers try to
minimize different criteria, was introduced as an extension of
differential game theory. The organizational form of the system permits
one decision maker to be the coordinator or leader, while the decision
makers for the other subsystems are all followers with respect to the
coordinator. The followers may or may not cooperate among themselves,
so they can select a Nash strategy or a Pareto optimal strategy with
respect to the other second-level decision makers. Centralized and
decentralized control structures were considered. A decentralized
structure is more realizable since the control sequences are functions
of measurable outputs only. The equilibrium solutions are obtained via
dynamic programming. The solutions for the centralized structure, with
both perfect and nested information, can be obtained backwards in time
from given final conditions. Decentralized constraints, however, lead
to a discrete two-point boundary value problem. A simple procedure to
solve this problem is suggested, but the conditions for convergence are
not yet available. As with most problems of this type, the solutions
depend very much on the initial guess.

Finally, decentralized Stackelberg coordination is applied to a
three-area interconnected power system. This method allows each control
area to select an appropriate cost function and to feed back only its
own area measurements, which is more realistic in practical situations.
The design procedure emphasizes proportional-plus-integral feedback
control. The study gave satisfactory results.

Further study of decentralized Stackelberg coordination should
include the stability and convergence conditions of the procedure.
Comparison of this control with other controls is also suggested.
Another interesting extension of this work would be to investigate the
stochastic Stackelberg coordination of nonlinear systems. Since
differential dynamic programming failed to obtain the N-person
nonzero-sum Nash equilibrium solution, the same problem still exists
when using this method to solve nonlinear stochastic Stackelberg
coordination.


APPENDIX  1


Consider the augmented system (3.12)

$$x(k+1) = A(k)x(k) + B^0(k)u^0(k) + B^1(k)u^1(k) + B^2(k)u^2(k) + v(k) \tag{A.1.1}$$

then

$$E[x(k+1)/z(k)=x] = Ax(k) + B^0u^0(k) + B^1u^1(k) + B^2u^2(k) \tag{A.1.2}$$

and the quadratic cost (3.14)

$$J^i(u^i) = \tfrac{1}{2}x^T(N)Q^i(N)x(N) + \tfrac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Q^i(k)x(k) + u^{iT}(k)R^i(k)u^i(k)\right] \tag{A.1.3}$$


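Assume that the expected cost-to-go at stage k is

$$E[V^i(k)/x(k)] = \tfrac{1}{2}x^T(k)S^i(k)x(k) + \tfrac{1}{2}\gamma^i(k), \quad i=1,2 \tag{A.1.4}$$

then

$$E[V^i(k)/x(k)] = \min_{u^i(k)} E\left[\tfrac{1}{2}x^T(k)Q^i(k)x(k) + \tfrac{1}{2}u^{iT}(k)R^i(k)u^i(k) + V^i(k+1)/x(k)\right], \quad i=1,2 \tag{A.1.5}$$

When $k=N$,

$$V^i(N) = \tfrac{1}{2}x^T(N)Q^i(N)x(N), \quad i=1,2 \tag{A.1.6}$$

and at stage $k+1$,

$$\begin{aligned}
E[V^i(k+1)/x(k)] ={}& \tfrac{1}{2}\left(Ax(k)+B^iu^i(k)+B^ju^j(k)+B^0u^0(k)\right)^T S^i(k+1) \\
& \left(Ax(k)+B^iu^i(k)+B^ju^j(k)+B^0u^0(k)\right) + \tfrac{1}{2}\,\mathrm{tr}\,S^i(k+1)\Lambda(k) \\
& + \tfrac{1}{2}\gamma^i(k+1), \quad i=1,2,\ i\neq j
\end{aligned} \tag{A.1.7}$$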


Using (A.1.7) in (A.1.5) to obtain the $u^i(k)$ that minimizes the
expected value of the cost function,

$$u^i(k) = -[R^i + B^{iT}S^i(k+1)B^i]^{-1}B^{iT}S^i(k+1)[Ax(k) + B^ju^j(k) + B^0u^0(k)], \quad i=1,2,\ i\neq j \tag{A.1.8}$$

Let

$$L^i(k) = [R^i + B^{iT}S^i(k+1)B^i]^{-1}B^{iT}S^i(k+1) \tag{A.1.9}$$

Then (A.1.8) becomes

$$u^i(k) = -L^i(k)[Ax(k) + B^ju^j(k) + B^0u^0(k)], \quad i=1,2,\ i\neq j \tag{A.1.10}$$

For two subsystems, solve for $u^1(k)$ and $u^2(k)$:

$$u^1(k) = -\Delta^1(k)\left(Ax(k) + B^0u^0(k)\right) \tag{A.1.11}$$

and

$$u^2(k) = -\Delta^2(k)\left(Ax(k) + B^0u^0(k)\right) \tag{A.1.12}$$

where

$$\Delta^i(k) = [I - L^iB^jL^jB^i]^{-1}[L^i - L^iB^jL^j], \quad i=1,2,\ i\neq j \tag{A.1.13}$$
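
The step from (A.1.10) to (A.1.13) can be written out as follows. This
is a sketch in which $w(k) = Ax(k) + B^0u^0(k)$ is a shorthand
introduced here, $\Delta^i$ denotes the gain of (A.1.11) to (A.1.13),
the stage argument is dropped, and the inverse is assumed to exist.

```latex
\begin{aligned}
u^1 &= -L^1\,(w + B^2 u^2), \qquad u^2 = -L^2\,(w + B^1 u^1) \\
u^1 &= -L^1 w + L^1 B^2 L^2\,(w + B^1 u^1) \\
(I - L^1 B^2 L^2 B^1)\,u^1 &= -(L^1 - L^1 B^2 L^2)\,w \\
u^1 &= -\underbrace{[I - L^1 B^2 L^2 B^1]^{-1}[L^1 - L^1 B^2 L^2]}_{\Delta^1}\,w
\end{aligned}
```

The expression for $\Delta^2$ follows by exchanging the indices 1 and 2.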

Using (A.1.11) and (A.1.12) in (A.1.1) and defining

$$\bar{A}(k) = A - B^1\Delta^1A - B^2\Delta^2A \tag{A.1.14}$$

$$\bar{B}(k) = B^0 - B^1\Delta^1B^0 - B^2\Delta^2B^0 \tag{A.1.15}$$

we have

$$x(k+1) = \bar{A}(k)x(k) + \bar{B}(k)u^0(k) + v(k) \tag{A.1.16}$$

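Now

$$E[V^0(k)/x(k)] = \tfrac{1}{2}x^T(k)S^0(k)x(k) + \tfrac{1}{2}\gamma^0(k) \tag{A.1.17}$$

Then

$$\begin{aligned}
E[V^0(k)/x(k)] &= \min_{u^0(k)} E\left[\tfrac{1}{2}x^T(k)Q^0(k)x(k) + \tfrac{1}{2}u^{0T}(k)R^0(k)u^0(k) + V^0(k+1)/x(k)\right] \\
&= \min_{u^0(k)} \left[\tfrac{1}{2}x^T(k)Q^0(k)x(k) + \tfrac{1}{2}u^{0T}(k)R^0(k)u^0(k) + E[V^0(k+1)/x(k)]\right]
\end{aligned} \tag{A.1.18}$$

At $k=N$,

$$V^0(N) = \tfrac{1}{2}x^T(N)Q^0(N)x(N) \tag{A.1.19}$$

and at stage $k+1$,

$$\begin{aligned}
E[V^0(k+1)/x(k)] ={}& \tfrac{1}{2}\left(\bar{A}(k)x(k)+\bar{B}(k)u^0(k)\right)^T S^0(k+1)\left(\bar{A}(k)x(k)+\bar{B}(k)u^0(k)\right) \\
& + \tfrac{1}{2}\,\mathrm{tr}\,S^0(k+1)\Lambda(k) + \tfrac{1}{2}\gamma^0(k+1)
\end{aligned} \tag{A.1.20}$$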

Using (A.1.20) in (A.1.18) we obtain

$$u^{0*}(k) = -[R^0 + \bar{B}^TS^0(k+1)\bar{B}]^{-1}\bar{B}^TS^0(k+1)\bar{A}x(k) \tag{A.1.21}$$

Let

$$L^0(k) = [R^0 + \bar{B}^TS^0(k+1)\bar{B}]^{-1}\bar{B}^TS^0(k+1)\bar{A} \tag{A.1.22}$$

Then

$$u^{0*}(k) = -L^0(k)x(k) \tag{A.1.23}$$

To obtain the recursive equation for $S^0(k)$, use (A.1.23) in (A.1.18).
After some algebra,

$$S^0(k) = Q^0(k) + \bar{A}^TS^0(k+1)\bar{A} - L^{0T}[R^0 + \bar{B}^TS^0(k+1)\bar{B}]L^0 \tag{A.1.24}$$

$$S^0(N) = Q^0(N) \tag{A.1.25}$$

$$\gamma^0(k) = \gamma^0(k+1) + \mathrm{tr}\,S^0(k+1)\Lambda(k) \tag{A.1.26}$$

$$\gamma^0(N) = 0 \tag{A.1.27}$$

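To obtain the recursive equations for $S^i(k)$, $i=1,2$, use (A.1.23),
(A.1.11), (A.1.12), and (A.1.5). After some algebra,

$$\begin{aligned}
S^i(k) ={}& Q^i(k) + (A - B^0L^0)^T\Delta^{iT}(k)R^i\Delta^i(k)(A - B^0L^0) \\
& + (\bar{A} - \bar{B}L^0)^TS^i(k+1)(\bar{A} - \bar{B}L^0), \quad i=1,2
\end{aligned} \tag{A.1.28}$$

$$S^i(N) = Q^i(N), \quad i=1,2 \tag{A.1.29}$$

$$\gamma^i(k) = \gamma^i(k+1) + \mathrm{tr}\,S^i(k+1)\Lambda(k), \quad i=1,2 \tag{A.1.30}$$

$$\gamma^i(N) = 0, \quad i=1,2 \tag{A.1.31}$$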
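
The backward recursion of this appendix can be sketched for the
simplest possible case. The code below assumes a scalar, time-invariant
system with positive control weights, which is a simplification made
here for brevity; the appendix itself treats the matrix, time-varying
case. The function name and argument layout are hypothetical, and `D1`,
`D2` stand for the follower gains of (A.1.13).

```python
def stackelberg_scalar_gains(A, B0, B1, B2, Q, R, Lam, N):
    """Scalar version of the recursions (A.1.9)-(A.1.31).

    Q, R: 3-tuples indexed by player (0 = coordinator, 1 and 2 = followers).
    Lam:  variance of the process noise v(k).
    Returns ([(k, L0, D1, D2), ...] for k = 0..N-1, S at k = 0, gamma at k = 0).
    """
    S = [Q[0], Q[1], Q[2]]        # S^i(N) = Q^i(N)
    gamma = [0.0, 0.0, 0.0]       # gamma^i(N) = 0
    gains = []
    for k in range(N - 1, -1, -1):
        # Follower reaction gains L^i(k), eq. (A.1.9)
        L1 = B1 * S[1] / (R[1] + B1 * S[1] * B1)
        L2 = B2 * S[2] / (R[2] + B2 * S[2] * B2)
        # Simultaneous solution of the two reactions, eq. (A.1.13)
        D1 = (L1 - L1 * B2 * L2) / (1.0 - L1 * B2 * L2 * B1)
        D2 = (L2 - L2 * B1 * L1) / (1.0 - L2 * B1 * L1 * B2)
        # Dynamics seen by the coordinator, eqs. (A.1.14)-(A.1.15)
        A_bar = A - B1 * D1 * A - B2 * D2 * A
        B_bar = B0 - B1 * D1 * B0 - B2 * D2 * B0
        # Coordinator gain, eq. (A.1.22)
        W = R[0] + B_bar * S[0] * B_bar
        L0 = B_bar * S[0] * A_bar / W
        # Coordinator cost-to-go, eqs. (A.1.24) and (A.1.26)
        new_S = [Q[0] + A_bar * S[0] * A_bar - L0 * W * L0]
        new_gamma = [gamma[0] + S[0] * Lam]
        # Follower cost-to-go, eqs. (A.1.28) and (A.1.30)
        react = A - B0 * L0
        closed = A_bar - B_bar * L0
        for i, D in ((1, D1), (2, D2)):
            new_S.append(Q[i] + react * D * R[i] * D * react
                         + closed * S[i] * closed)
            new_gamma.append(gamma[i] + S[i] * Lam)
        S, gamma = new_S, new_gamma
        gains.append((k, L0, D1, D2))
    gains.reverse()
    return gains, S, gamma
```

With `B1 = B2 = 0` the followers have no influence and the recursion
reduces to the ordinary scalar Riccati equation for the coordinator,
which gives a simple check of the implementation.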



APPENDIX  2 

Given a stochastic Markov sequence of state vectors $\{x(k)\}$,

$$x(k+1) = A(k)x(k) + B^0(k)u^0(k) + B^1(k)u^1(k) + B^2(k)u^2(k) + v(k) \tag{A.2.1}$$

where $u^i(k)$, $i=0,1,2$, are deterministic inputs and $v(k)$ is random,
and measurements given by

$$z^1(k) = z^2(k) = H(k)x(k) + \xi(k) \tag{A.2.2}$$

$$z^0(k) \supset z^1(k); \quad z^0(k) = H^0(k)x(k) + \xi^0(k) \tag{A.2.3}$$

The assumptions are the same as given in Section 3.2. Define

$$z^*(k) = [z^{1T}(0), \ldots, z^{1T}(k)]^T \tag{A.2.4}$$

$$z^{0*}(k) = [z^{0T}(0), \ldots, z^{0T}(k)]^T \tag{A.2.5}$$

$$\hat{x}(k) = E[x(k)/z^*(k)] \tag{A.2.6}$$

$$\hat{x}^0(k) = E[x(k)/z^{0*}(k)] \tag{A.2.7}$$

$$P(k/k) = E\{(x(k)-\hat{x}(k))(x(k)-\hat{x}(k))^T/z^*(k)\} \tag{A.2.8}$$

$$\hat{x}(k+1/k) = E[x(k+1)/z^*(k)] \tag{A.2.9}$$

The recursive relations defining the conditional expectations for the
lower-level subsystems are given by

$$\hat{x}(k+1/k) = A(k)\hat{x}(k) + B^0(k)u^0(k) + B^1(k)u^1(k) + B^2(k)u^2(k) \tag{A.2.10}$$

$$P(k+1/k) = A(k+1)P(k/k)A^T(k+1) + \Lambda(k) \tag{A.2.11}$$

$$\hat{x}(k+1) = \hat{x}(k+1/k) + K(k+1)[z(k+1) - H(k+1)\hat{x}(k+1/k)] \tag{A.2.12}$$

$$K(k+1) = P(k+1/k)H^T(k+1)[H(k+1)P(k+1/k)H^T(k+1) + \Sigma(k+1)]^{-1} \tag{A.2.13}$$

$$P(k+1/k+1) = [I - K(k+1)H(k+1)]P(k+1/k) \tag{A.2.14}$$

$$P(0/0) = \Sigma(0) \tag{A.2.15}$$

Also

$$E[\hat{x}(k+1)/z^*(k)] = \hat{x}(k+1/k) = A\hat{x}(k) + B^0u^0(k) + B^1u^1(k) + B^2u^2(k) \tag{A.2.16}$$

$$\mathrm{Cov}[\hat{x}(k+1)/z^*(k)] = K(k+1)[H(k+1)P(k+1/k)H^T(k+1) + \Sigma(k+1)]K^T(k+1) \tag{A.2.17}$$
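
One stage of the lower-level filter can be sketched in code. The
example assumes a scalar, time-invariant system, and the function name
and argument order are hypothetical. `u_sum` stands for the known input
term $B^0u^0(k) + B^1u^1(k) + B^2u^2(k)$, `Lam` for the process noise
variance and `Sigma` for the measurement noise variance.

```python
def kalman_step(x_hat, P, z_next, u_sum, A, H, Lam, Sigma):
    """One scalar filter stage following (A.2.10)-(A.2.14).

    x_hat, P: estimate and error variance at stage k.
    z_next:   measurement received at stage k+1.
    Returns (estimate at k+1, error variance at k+1, gain K(k+1)).
    """
    # Prediction, eqs. (A.2.10) and (A.2.11)
    x_pred = A * x_hat + u_sum
    P_pred = A * P * A + Lam
    # Gain, eq. (A.2.13)
    K = P_pred * H / (H * P_pred * H + Sigma)
    # Measurement update, eqs. (A.2.12) and (A.2.14)
    x_new = x_pred + K * (z_next - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new, K


# With equal prior and measurement variances the update splits the difference.
x1, P1, K1 = kalman_step(x_hat=0.0, P=1.0, z_next=2.0, u_sum=0.0,
                         A=1.0, H=1.0, Lam=0.0, Sigma=1.0)
```

The coordinator's filter has the same form with $H^0$, $z^0$ and its own
error variance in place of the lower-level quantities.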


The recursive relation defining the conditional expectation for the
coordinator subsystem is given by

$$\hat{x}^0(k+1) = \hat{x}^0(k+1/k) + K^0(k+1)[z^0(k+1) - H^0(k+1)\hat{x}^0(k+1/k)]$$

$$K^0(k+1) = P^0(k+1/k)H^{0T}(k+1)[H^0(k+1)P^0(k+1/k)H^{0T}(k+1) + \Sigma^0(k+1)]^{-1}$$

$$P^0(k+1/k) = A(k+1)P^0(k/k)A^T(k+1) + \Lambda(k)$$

$$P^0(k+1/k+1) = [I - K^0(k+1)H^0(k+1)]P^0(k+1/k)$$

$$P^0(0/0) = \Sigma(0)$$

Also

$$E[\hat{x}^0(k+1)/z^*(k)] = A\hat{x}^0(k) + B^0u^0(k) + B^1u^1(k) + B^2u^2(k)$$

$$\mathrm{Cov}[\hat{x}^0(k+1)/z^*(k)] = K^0(k+1)[H^0(k+1)P(k+1/k)H^{0T}(k+1) + \Sigma^0(k+1)]K^{0T}(k+1)$$
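Assume that at stage k the cost-to-go for the i-th subsystem is

$$J^{i*}(k) = \tfrac{1}{2}\hat{x}^T(k)S^i(k)\hat{x}(k) + \tfrac{1}{2}\gamma^i(k) \tag{A.2.18}$$

The optimal strategies for subsystem i are given by

$$u^i(k) = \arg\min_{u^i(k)} E\left[\tfrac{1}{2}x^T(k)Q^i(k)x(k) + \tfrac{1}{2}u^{iT}(k)R^i(k)u^i(k) + J^{i*}(k+1)/z^*(k)\right] \tag{A.2.19}$$

At $k=N$,

$$\begin{aligned}
J^{i*}(N) &= E\left[\tfrac{1}{2}x^T(N)Q^i(N)x(N)/z^*(N)\right] \\
&= \tfrac{1}{2}\hat{x}^T(N)Q^i(N)\hat{x}(N) + \tfrac{1}{2}\,\mathrm{tr}\,Q^i(N)P(N)
\end{aligned} \tag{A.2.20}$$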


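and at stage k, using (A.2.16) to (A.2.18),

$$\begin{aligned}
J^{i*}(k) = \min_{u^i(k)} \Big\{ & \tfrac{1}{2}\hat{x}^T(k)Q^i(k)\hat{x}(k) + \tfrac{1}{2}\,\mathrm{tr}\,Q^i(k)P(k) + \tfrac{1}{2}u^{iT}(k)R^i(k)u^i(k) \\
& + \tfrac{1}{2}[A\hat{x}(k) + B^0u^0(k) + B^iu^i(k) + B^ju^j(k)]^T S^i(k+1) \\
& \quad [A\hat{x}(k) + B^0u^0(k) + B^iu^i(k) + B^ju^j(k)] \\
& + \tfrac{1}{2}\,\mathrm{tr}\,S^i(k+1)K(k+1)[H(k+1)P(k+1/k)H^T(k+1) + \Sigma(k+1)]K^T(k+1) \\
& + \tfrac{1}{2}\gamma^i(k+1) \Big\}
\end{aligned} \tag{A.2.21}$$

The minimising control $u^i(k)$ is

$$u^i(k) = -[R^i(k) + B^{iT}S^i(k+1)B^i]^{-1}B^{iT}S^i(k+1)[A\hat{x}(k) + B^0u^0(k) + B^ju^j(k)] \tag{A.2.22}$$

Recall the definition of $L^i(k)$ in (A.1.9),

$$L^i(k) = [R^i(k) + B^{iT}S^i(k+1)B^i]^{-1}B^{iT}S^i(k+1) \tag{A.2.23}$$

Then

$$u^i(k) = -L^i(k)[A\hat{x}(k) + B^0u^0(k) + B^ju^j(k)] \tag{A.2.24}$$

For two subsystems, solve for $u^1(k)$ and $u^2(k)$:

$$u^1(k) = -\Delta^1(k)[A\hat{x}(k) + B^0u^0(k)] \tag{A.2.25}$$

$$u^2(k) = -\Delta^2(k)[A\hat{x}(k) + B^0u^0(k)] \tag{A.2.26}$$

where

$$\Delta^i(k) = [I - L^iB^jL^jB^i]^{-1}[L^i - L^iB^jL^j], \quad i=1,2,\ i\neq j \tag{A.2.27}$$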


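Assume that at stage k the cost-to-go for the coordinator subsystem
is

$$J^{0*}(k) = \tfrac{1}{2}
\begin{bmatrix} \hat{x}^0(k) \\ \hat{x}(k)-\hat{x}^0(k) \end{bmatrix}^T
\begin{bmatrix} S_A(k) & S_B(k) \\ S_B^T(k) & S_C(k) \end{bmatrix}
\begin{bmatrix} \hat{x}^0(k) \\ \hat{x}(k)-\hat{x}^0(k) \end{bmatrix} \tag{A.2.28}$$

At $k<N$,

$$u^{0*}(k) = \arg\min_{u^0(k)} E\left[\tfrac{1}{2}x^T(k)Q^0(k)x(k) + \tfrac{1}{2}u^{0T}(k)R^0(k)u^0(k) + J^{0*}(k+1)/z^{0*}(k)\right] \tag{A.2.29}$$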


For  any  matrix  P [3.12] 

S{xoT(k+1)rx(k+!)/s°*(k)} 

= S([ x°(k+l/k)-rK(k+i  )[s(k+l  )-H(k+l  )x°(k*1/k)3  3 1 

r[x(k+l/k)+K(ke1)[z(k+l  )-H(k+l  )x(k+i  )/'k)3  3/z°*(k)3  (A. 2. 30)
where

X(k+1)  = P°(k+l/k)HT(k+1  )CH(k+'t  )?°(k+l/k)HT(k+l  )+Z(k+i )]“’  (A.  2. 30 

E{xoT(k+l)rx(k+D/2°*(k)}  = x°(k+l)Trx(k+l)  + xoT(k+l)r:<(k+l)H(k+D 

(x°(k+1  )-x(k+t ) ) + tr?°(k+-l/k)rx(k+!)H(k+1) 


(A. 2. 32) 

E[x(k+1  )rx(k+1 )/z°*(k) ] 

= E{[x(k+1/k)+K(k+1 )[z(k+1 )-H(k+1 )x(k+1/k)3 ]T 

Cx(k+l/k)+K(k+i )Cz(k+1 )-H(k+1 )x(k+l/k)] ]/z°*(k) } 

= xT(k+1  )rx(k+1 ) + 2xT(k+1  )FK(k+1 )K(k+1 ) (x°(k+l )-x(k+1 ) ) 

+ tr{rK(k+i )CH(k+i )P°(k+i/k)HT(k+i )+S(k+l ) ]KT(k+i ) 

+ ( x(k+i ) x°(k+l )THT(k+l )KT(k+l )PK(k+1 )H(k+i ) ( x(k+i )-x°(k*1 ) ) 


(A. 2. 33) 


Expand  (A. 2. 29)  using  (A. 2. 32)  and  (A. 2. 33) 


u°*(k)  = arg  rain  [pX0T(k)Q0(k)$c(k)+lu0T(k)R0(k)u0(k)+pCrQ0(k)?0(k) 
u°(k)  “ 

+ 4xoT(k+1)(SA-AC-2S8)x°(k+1)  + xoT(k+l ) (S3-SC)x(k+i ) 

+ xoT(k+1 ) (S3-SC)$C(k+1  )H(k+l ) (x°(k+l  )-x(k+1 ) ) 

+ xT(k+-l  )SCx(k+l ) + xT(k+l  )SCX(kf  1 )H(k+1 ) (x°(k+!  )-x(k+i ) ' 

+ ^(x(k+l )-x°(k+1 ) )TKT(k+l )SCT(k+1 )SCS(k+1 )H(X+1 ) 

(x(k+1 )-x°(k+l ) ) + ^rT°(k) 

+ ^tr{X°(k+l  )[H°(k+i  )?°(kfl/k) KoT (!<■*■  1 )*r°(k+l  )K<°^(k+1 ) 
(3A+SC-2SB) } + tr2P°lk+l/k)K(k+1)H(k+i)(SB-S5) 

+ ^trfC(k+i)CH(k+i)P°(k+i/k)HT(k+i)+Z(k*03KT(k*0Sc] 

(A.2.3U) 

Recall that

$$\hat{x}^0(k+1) = A(k)\hat{x}^0(k) - \left(B^1(k)\Delta^1(k)A(k) + B^2(k)\Delta^2(k)A(k)\right)\hat{x}(k) + \bar{B}(k)u^0(k) \tag{A.2.35}$$

where

$$\bar{B}(k) = B^0(k) - B^1(k)\Delta^1(k)B^0(k) - B^2(k)\Delta^2(k)B^0(k) \tag{A.2.36}$$

Let

$$G(k) = B^1(k)\Delta^1(k) + B^2(k)\Delta^2(k) \tag{A.2.37}$$

Then (A.2.35) becomes

$$\hat{x}^0(k+1) = (I-G(k))A(k)\hat{x}^0(k) - G(k)A(k)(\hat{x}(k)-\hat{x}^0(k)) + \bar{B}(k)u^0(k) \tag{A.2.38}$$

and

$$\hat{x}(k+1) = (I-G(k))A(k)\hat{x}^0(k) + (I-G(k))A(k)(\hat{x}(k)-\hat{x}^0(k)) + \bar{B}(k)u^0(k) \tag{A.2.39}$$

$$\hat{x}(k+1)-\hat{x}^0(k+1) = A(k)(\hat{x}(k)-\hat{x}^0(k)) \tag{A.2.40}$$
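
The difference relation (A.2.40) can be checked directly. In the sketch
below $d(k) = \hat{x}(k)-\hat{x}^0(k)$ is a shorthand introduced here,
the stage argument of $G$ and $A$ is dropped, and the term in $d(k)$ is
taken to enter (A.2.39) with a plus sign, which is what (A.2.40)
requires.

```latex
\begin{aligned}
\hat{x}(k+1) - \hat{x}^0(k+1)
&= \left[(I-G)A\,\hat{x}^0(k) + (I-G)A\,d(k) + \bar{B}u^0(k)\right] \\
&\quad - \left[(I-G)A\,\hat{x}^0(k) - GA\,d(k) + \bar{B}u^0(k)\right] \\
&= (I-G)A\,d(k) + GA\,d(k) \\
&= A\,d(k)
\end{aligned}
```

The control $u^0(k)$ cancels, so the difference between the two
estimates evolves independently of the coordinator's decision.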

Substituting (A.2.40) in (A.2.34) and differentiating, $u^{0*}(k)$ is
given by

$$u^{0*}(k) = -\Delta^0(k)Y(k)\hat{x}^0(k) - \Delta^0(k)M(k)[\hat{x}(k)-\hat{x}^0(k)] \tag{A.2.41}$$

where

$$\Delta^0(k) = [R^0(k) + \bar{B}^T(k)S_A(k+1)\bar{B}(k)]^{-1}$$

$$Y(k) = \bar{B}^T(k)S_A(k+1)[I-G(k)]A(k)$$

M(k)  = "3T(k)SA(k+1  )C-(k)A(k)  + 3T(k)  (SB(k*1  )-SA(k+1 ) )A(k) 

- BT(k)SS(k+1)X(k+1)H(k+1)A(k) 


The recursive equations for $S_A$, $S_B$, $S_C$ are obtained by
substituting $u^{0*}(k)$ back in (A.2.40):

$$S_A(k) = Q^0(k) + A^T(k)(I-G(k))^TS_A(k+1)(I-G(k))A(k) - Y^T(k)\Delta^0(k)Y(k) \tag{A.2.42}$$

SB(k)  = AT(k) (I-G(k) )TS3(k+? ) ( I-G(k) ) A ( k ) 

+ AF(k) (I-G(k) )T(SB(k+J )-SA(k+l ) )G(k) A(k) 

A (k)  (I-G(k)  )^SB(k-*-1  )K(k+1  )H(kfl  )A(k)  - Y^(k)A°'xk)M(k) 


(A. 2. 43)
SC(k)  = - Mr(k)A°(k)M(k)  + AT(k)GT(k)SA(k+1 )G(k) A(k)

+ AT(k) [I-K(k+1 )H(k+l ) ]TSC(k+t )[I-K(k+1 )H(k+1 ) ]A(k) 

+ AT(k) (S3(k+1 )K(k+1 )H(k+1 )-S3(k+1 ) )G(k)A(k) 

- AT(k)GT(k)  (SB(k+l  )-SB(k+ 1 ):<( k+l  )H(k+1 ) ) A(k)  u.2.44) 

*°(  k)  = 'S°(k+1 ) + trQ°(k)P°(k) 

+ tr[K°(k+1 )[H°(k+l )?°(k+1/k)HoT(k+1 )+S°(k)]fCoT(k+1 ) 

(SA(k+1 )+3C(k+1 )-2S3(k+1 ) ) ] 

+ 2trP°(k+i/k)5C(k+i  )H(k+1  )f  S3(k+1  )-SC(k+l ) ) 

+ tr5C(k+1)[H(k+1  )Pa(k+1/k)HT(k+i  )+S(k-r1  )]KT(k+1  )SC(k+1 ) 

(A. 2. 45) 


To obtain the recursive equation for $S^i(k)$ of the i-th subsystem,
substitute $u^{0*}(k)$ and $u^i(k)$ back in (A.2.21)

S^k)  = Q1(k)  + (A(k)+8(k)  A°(k) Y(k)  )TS^(k+1 ) (A{k)+5(k)A°(k)i'(k) ) 

+ (^(k)  A(k)+8°(k)A°(k)Y(k))  rRi(k) (A^(k) A(k)+8°(k)A°(k)Y(k) ) 

1 = 1.2  (A. 2. 46) 

-r(k)  = * irQ~(k)P(k)  + trSi(k*: ) 

+ trS1(k+1  )K(k+1  )[H(k+1  )P(kfl/k)H'(k+1  )+3(k+1 ) ]KT(k^- 1 ) 

+ fcr [ P ( k/k ) -P°( k/k ) ] ( M ( k ) -Y ( k ) ) TaoT ( k ) 

(8oT(k)Ri(k)5°(k)+BTS‘(k+1 )8)A°(k) (M(k)-Y(k) ) (A. 2. ^7)